Forecast event attendance

fundraising, intermediate, emerging

The problem

You're planning an event and guessing attendance. Over-estimate and you waste money on catering, venue space and printed materials. Under-estimate and you disappoint people (the food runs out, the room is overcrowded). You're relying on gut feel or 'last year's numbers', but attendance varies with time of year, day of week, topic, weather and competing events. You need better predictions to plan resources efficiently.

The solution

Build a forecasting model using historical event data. It learns patterns: how registrations convert to attendance, seasonal effects (summer events get lower turnout), day-of-week impact (Saturday vs Tuesday), topic popularity, how far in advance people register. For new events it predicts: likely attendance range, optimal catering numbers, whether you need overflow space. Data-driven planning instead of guesswork.

What you get

Attendance forecast for upcoming event: 'Expected attendance: 75 people (range: 65-85 with 80% confidence). Based on: 95 registered, historical 79% show-up rate, Saturday event (+10% typical), summer month (-5% typical), similar topic averaged 72 attendees. Catering recommendation: 80 portions. Overflow planning: not needed unless registrations exceed 100.'

Before you start

Historical event data: registrations, actual attendance, date, day of week, topic/type
At least 15-20 past events to identify patterns
Understanding of factors that affect your attendance (topic, timing, format)
A Google account for Colab
Basic Python skills or willingness to adapt example code

When to use this

You run regular events and struggle to predict turnout accurately
You're wasting money on over-catering or disappointing people with under-capacity
Attendance varies significantly between events and you don't understand why
You've got historical event data to learn patterns from

When not to use this

You run very few events (under 10/year) - patterns unclear
You don't have historical attendance data - model needs training data
Every event is completely unique - no patterns to learn
Your events are always at capacity (waiting lists) - forecasting doesn't help, you need bigger venues

Steps

1
Gather historical event data
Export past event data: registration count, actual attendance, date, day of week, topic/event type, format (in-person/online/hybrid), whether it was free or paid, any special factors (celebrity speaker, bad weather). Before uploading to Google Colab, strip out personal data (names, emails) - you only need anonymised counts and categories. You need both inputs (what was planned) and outcomes (who actually came). Minimum 15-20 events, though with so few events any accuracy figure, such as the Mean Absolute Error (the average gap between predicted and actual attendance), is a rough guide rather than statistical certainty.
2
Calculate key metrics
For each past event, calculate: show-up rate (attendance/registrations), whether it was over/under capacity, seasonal timing (month, quarter), lead time (how far in advance people registered). These metrics help identify patterns: summer events might have 70% show-up vs 85% in autumn.
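A minimal sketch of these calculations in pandas, assuming one row per event. The column names (`date`, `registrations`, `attendance`, `capacity`) and the four events are made up for illustration; swap in your own anonymised export. Lead time is left out because it needs per-registration dates, which this sketch doesn't assume you have.

```python
import pandas as pd

# Hypothetical past events; replace with your own anonymised export.
events = pd.DataFrame({
    "date": pd.to_datetime(["2023-06-17", "2023-10-03", "2024-01-20", "2024-07-09"]),
    "registrations": [100, 60, 80, 50],
    "attendance": [70, 51, 68, 35],
    "capacity": [90, 50, 80, 60],
})

# Show-up rate: share of registered people who actually came.
events["show_up_rate"] = events["attendance"] / events["registrations"]

# Timing features derived from the event date.
events["day_of_week"] = events["date"].dt.day_name()
events["month"] = events["date"].dt.month
events["quarter"] = events["date"].dt.quarter

# Flag events where more people turned up than the room could hold.
events["over_capacity"] = events["attendance"] > events["capacity"]

# Average show-up rate per quarter, to spot seasonal differences.
by_quarter = events.groupby("quarter")["show_up_rate"].mean()

print(events[["date", "day_of_week", "quarter", "show_up_rate", "over_capacity"]])
print(by_quarter)
```

With real data, the per-quarter averages are where a seasonal gap like the 70% vs 85% example would show up.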
3
Identify attendance factors
What affects turnout for your events? Typical patterns: day of week (weekends vs weekdays), season (summer lower), topic (popular vs niche), price (free vs paid), format (online easier to skip), weather (for in-person). List 5-7 factors that matter in your context. These become model features.
4
Build the forecasting model
Use the example code (Random Forest regression) to learn patterns from historical data. The model picks up effects such as: 'Saturday events get 15% higher attendance', 'Summer months -10%', 'Popular topics +20%', 'Online events have 65% show-up vs 80% in-person'. It doesn't print these as rules; you see them by checking which features it ranks as most important and by comparing its predictions for different event set-ups. It learns what combinations predict high/low turnout. Note: the model needs to be retrained if a completely new event type or topic is introduced that wasn't in the training data.
5
Validate model accuracy
Test on events the model hasn't seen: how close are predictions to actual attendance? If predictions are typically within 10-15 people, that's useful for planning. If they're off by 50%, you need more data or different features. Also check that the results make intuitive sense: if your Saturday events usually draw more people than Tuesday ones, the model should predict that too.
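One way to test on unseen events when you only have 15-20 of them is leave-one-out cross-validation: train on all events but one, predict the one held back, and repeat for every event. The sketch below is an assumption about how you might do this with scikit-learn, not a prescribed method. The data is synthetic and the feature names (`is_weekend`, `is_summer`) are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Synthetic stand-in for 20 past events (assumption: replace with real data).
rng = np.random.default_rng(0)
n = 20
registrations = rng.integers(40, 121, size=n)
is_weekend = rng.integers(0, 2, size=n)
is_summer = rng.integers(0, 2, size=n)
rate = 0.75 + 0.08 * is_weekend - 0.06 * is_summer + rng.normal(0, 0.03, size=n)
attendance = np.round(registrations * rate).astype(int)

X = pd.DataFrame({
    "registrations": registrations,
    "is_weekend": is_weekend,
    "is_summer": is_summer,
})
y = attendance

# Each event is predicted by a model trained on the other 19.
model = RandomForestRegressor(n_estimators=200, random_state=0)
predicted = cross_val_predict(model, X, y, cv=LeaveOneOut())

# Mean Absolute Error: average number of people the forecast is off by.
mae = mean_absolute_error(y, predicted)

# Rough baseline: apply one flat show-up rate to every event.
flat_rate = y.sum() / registrations.sum()
baseline_mae = mean_absolute_error(y, registrations * flat_rate)

print(f"Model MAE: {mae:.1f} people")
print(f"Flat-rate baseline MAE: {baseline_mae:.1f} people")
```

If the model's error isn't clearly smaller than the flat-rate baseline, a simple show-up percentage may serve you just as well until you have more events.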
6
Forecast upcoming events
For new events, input: registrations so far, planned date, day of week, topic, format. The model predicts expected attendance with a confidence range. If you've got 80 registrations for a Saturday workshop, the model might predict 65 attendees (range 55-75). That's actionable for planning.
7
Use forecasts for resource planning
Turn predictions into decisions: catering numbers (forecast + 10% buffer), venue capacity needed, printed materials, staff allocation. If forecast is 65 with range 55-75, order catering for 70-75 (safe margin), book room for 80 (don't want cramped), print 70 handouts. Data-informed resource decisions.
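The arithmetic above can be wrapped in a small helper. Only the 10% catering buffer comes from the step itself. The room rule (upper end of the range plus 5) and the handout rule (halfway between forecast and upper end) are illustrative assumptions chosen to reproduce the worked numbers, and `plan_resources` is a hypothetical name.

```python
import math

def plan_resources(forecast, upper, buffer_pct=10, room_headroom=5):
    """Turn a forecast and the top of its range into planning numbers.

    forecast: expected attendance
    upper: upper end of the forecast range
    buffer_pct: catering buffer as a whole percentage (10 = forecast + 10%)
    room_headroom: extra seats on top of the upper bound (assumed rule)
    """
    return {
        # Forecast plus buffer, rounded up to whole portions.
        "catering": math.ceil(forecast * (100 + buffer_pct) / 100),
        # Room should fit the upper bound with some space to spare.
        "room_capacity": upper + room_headroom,
        # Handouts: halfway between forecast and upper bound (assumed rule).
        "handouts": math.ceil((forecast + upper) / 2),
    }

plan = plan_resources(forecast=65, upper=75)
print(plan)  # {'catering': 72, 'room_capacity': 80, 'handouts': 70}
```

Adjust the buffer and headroom to your own costs: a bigger buffer makes sense when running out is worse than having leftovers.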
8
Track and improve (optional)
After each event, record forecast vs actual. Were you close? If you're consistently over- or under-predicting, adjust. Feed actual results back into the model (retrain monthly). The model improves as it learns from more events. Track forecast accuracy over time - it should get better.
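A simple forecast log is enough to spot consistent over- or under-prediction. The sketch below uses made-up numbers. It treats a positive average error as over-forecasting, and tracks the average absolute error as the accuracy figure.

```python
from statistics import mean

# Hypothetical log: one entry per event, filled in after the event.
log = [
    {"event": "A", "forecast": 65, "actual": 60},
    {"event": "B", "forecast": 80, "actual": 78},
    {"event": "C", "forecast": 50, "actual": 52},
    {"event": "D", "forecast": 70, "actual": 63},
]

# Positive error = forecast was too high; negative = too low.
errors = [row["forecast"] - row["actual"] for row in log]

bias = mean(errors)                  # consistent over/under-prediction
mae = mean(abs(e) for e in errors)   # typical size of the miss

print(f"Average error (bias): {bias:+} people")
print(f"Average absolute error: {mae} people")
```

Here the forecasts run about 3 people high on average, which suggests trimming future forecasts slightly or retraining with the latest events included.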

Example code
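A minimal sketch of steps 4 and 6 in scikit-learn, assuming the historical data has been reduced to a handful of columns. The data is synthetic, and the column names, category values and model settings are all placeholders rather than a recommended configuration. The range is taken from the spread of the individual trees' predictions, which is a rough heuristic and not a calibrated confidence interval.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic history of 20 events (assumption: replace with your own data).
rng = np.random.default_rng(42)
n = 20
history = pd.DataFrame({
    "registrations": rng.integers(40, 121, size=n),
    "day_of_week": rng.choice(["Tuesday", "Thursday", "Saturday"], size=n),
    "month": rng.integers(1, 13, size=n),
    "format": rng.choice(["in-person", "online"], size=n),
})
rate = (
    0.78
    + 0.07 * (history["day_of_week"] == "Saturday")
    - 0.05 * history["month"].isin([6, 7, 8])
    - 0.12 * (history["format"] == "online")
    + rng.normal(0, 0.03, size=n)
)
history["attendance"] = (history["registrations"] * rate).round().astype(int)

# One-hot encode the categorical features so the model can use them.
categorical = ["day_of_week", "format"]
features = ["registrations", "day_of_week", "month", "format"]
X = pd.get_dummies(history[features], columns=categorical).astype(float)
y = history["attendance"]

model = RandomForestRegressor(n_estimators=300, random_state=0)
model.fit(X, y)

# Which features the model leans on most.
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))

# Forecast a new event: 80 registrations, Saturday in July, in person.
new_event = pd.DataFrame([{
    "registrations": 80, "day_of_week": "Saturday",
    "month": 7, "format": "in-person",
}])
X_new = (
    pd.get_dummies(new_event, columns=categorical)
    .reindex(columns=X.columns, fill_value=0)  # align with training columns
    .astype(float)
)

expected = model.predict(X_new)[0]

# Rough range: 10th to 90th percentile of the individual trees' predictions.
per_tree = np.array([t.predict(X_new.to_numpy())[0] for t in model.estimators_])
low, high = np.percentile(per_tree, [10, 90])

print(f"Expected attendance: {expected:.0f} (rough range {low:.0f}-{high:.0f})")
```

Because a Random Forest averages outcomes it has already seen, it can't forecast attendance above the largest event in your history; unusually big events need a manual sanity check.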