Oslo's bike sharing system Bysykkel is a convenient and popular way to get across town. It can dramatically shorten commuting times, and it is fun and healthy to use. As a regular user of the system myself, I often noticed that bike stations downtown tend to fill up, while stations further out tend to run out of bikes. This behavior sparked my curiosity. My gut feeling was that this pattern is caused by Oslo's topography, which slopes gently upwards from the downtown areas at the waterfront to the residential areas. I suspected that people are simply too lazy to bike up again. I set out to prove it.
The goal of this project is to investigate how the topography of Oslo affects the flow of bikes. Furthermore, I am trying to understand how the system works, how the network is structured, and whether there are clear temporal patterns in the flow of bikes. To make this approach more structured, I am trying to answer the following questions:
Historical data can be freely accessed (Oslo Bysykkel Historical Data) and dates back all the way to 2019. The data is structured in CSV files, one for each month, and contains records of all rides carried out during that month. Each row in the dataset represents one ride and includes the following information (list not exhaustive):
The full analysis with all the nitty gritty details can be found on my GitHub page. This article will only concentrate on the findings and the results, and won't bore you with how I got there. The project was done in a Python Jupyter Notebook using SQL and DuckDB for data handling and cleaning, and seaborn, matplotlib and Folium for data visualization. For all the details, see here:
This article is structured as follows: I will start out by investigating how the topography of Oslo affects the flow of bikes in Topographical Analysis. Afterwards, in Network Structure Analysis, I will investigate how the network structure is built up and what the most important stations and bike pathways are. Finally, in Temporal Pattern Analysis, I will examine what temporal patterns are hidden in the flow of bikes.
In order to investigate the effect of elevation on cyclists' willingness to pedal up the hill back home, I first had to enrich the data with additional features:
Using these values, I can start to form an idea on cycling behavior.
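As a rough illustration of this kind of enrichment, here is a minimal sketch in plain Python. It assumes that each station has a known elevation in metres and that the gradient of a trip is the elevation difference divided by the straight-line distance between the two stations. The function names `haversine_m` and `gradient_pct` are hypothetical, and the definitions used in the actual notebook may differ.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6_371_000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def gradient_pct(start_elev_m, end_elev_m, distance_m):
    """Average gradient in percent (assumed definition).

    Negative values mean the trip ends lower than it starts (downhill).
    Returns None for zero distance, e.g. a round trip to the same station.
    """
    if distance_m == 0:
        return None
    return 100.0 * (end_elev_m - start_elev_m) / distance_m


# Example: a 2 km trip that drops from 50 m to 10 m has an average gradient of -2 %.
print(gradient_pct(50, 10, 2000))
```

Note that a straight-line distance underestimates the distance actually cycled, so gradients computed this way are only an approximation.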
This question can best be answered by looking at the distribution of gradients for all trips and checking whether this distribution is skewed towards uphill or downhill travel, or whether it is unbiased.
The distribution above clearly shows that downhill travel is more prevalent than uphill travel, since it is visually skewed towards negative values. Another feature that strikes the eye is a pronounced spike at gradient 0%, which likely represents popular rides along flat waterfront paths. The analysis revealed the following statistics:
These numbers strongly support the idea that downhill travel is more popular, but we can dig deeper.
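Shares of uphill, downhill, and flat trips can be derived from the per-trip gradients with a few lines of Python. The sketch below is an assumption about how such a split could be done: the helper names and the `flat_tol` tolerance are hypothetical, and the threshold used for "flat" in the actual analysis is not stated here.

```python
from collections import Counter


def classify(gradient, flat_tol=0.0):
    """Label a trip by the sign of its gradient (in percent)."""
    if gradient > flat_tol:
        return "uphill"
    if gradient < -flat_tol:
        return "downhill"
    return "flat"


def direction_shares(gradients, flat_tol=0.0):
    """Return the percentage of uphill, downhill, and flat trips."""
    counts = Counter(classify(g, flat_tol) for g in gradients)
    total = sum(counts.values())
    return {k: 100.0 * counts[k] / total for k in ("uphill", "downhill", "flat")}


# Toy input: two downhill trips, one flat trip, one uphill trip.
print(direction_shares([-2.0, -1.0, 0.0, 3.0]))
```

With `flat_tol=0.0` only trips between stations at exactly the same elevation count as flat; a small positive tolerance would widen that band.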
Counting uphill vs. downhill trips alone might not show the full picture. While downhill trips could be more frequent, more time could actually be spent on uphill trips, since these tend to take longer. To answer this question, I split the data into uphill, downhill, and flat trips, and examined the individual distributions for duration and distance.
The figure above clearly demonstrates that downhill travel is more popular across several metrics. To be specific:
The fact that downhill travel prevails has the logical consequence that uphill stations would constantly lose bikes, while they accumulate at downhill ones. To quantify this effect, I compute the net bike flow (i.e. bike flux) for each station. This value quantifies how many bikes a station lost or gained during the course of the year 2024. If a station mainly serves as the starting point for bike rides, it has a negative flux. On the other hand, if a station mainly serves to return bikes, it has a positive flux. According to this logic, these stations can consequently be classified as either an exporter or importer station.
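The flux calculation described above boils down to counting arrivals minus departures per station. Here is a minimal sketch in plain Python, assuming trips are available as (start station, end station) pairs. The function names and the `tol` parameter for a "balanced" band are hypothetical and not taken from the original notebook.

```python
from collections import Counter


def net_flux(trips):
    """Net bike flow per station: arrivals minus departures.

    trips: iterable of (start_station, end_station) pairs.
    """
    flux = Counter()
    for start, end in trips:
        flux[start] -= 1  # a departure removes a bike from the start station
        flux[end] += 1    # an arrival adds a bike to the end station
    return dict(flux)


def classify_station(flux, tol=0):
    """Positive flux means importer, negative flux means exporter."""
    if flux > tol:
        return "importer"
    if flux < -tol:
        return "exporter"
    return "balanced"


trips = [("A", "B"), ("A", "B"), ("B", "A"), ("C", "B")]
for station, flux in net_flux(trips).items():
    print(station, flux, classify_station(flux))
```

Because every trip removes one bike from one station and adds one to another, the fluxes of all stations always sum to zero, which makes for a handy sanity check.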
Now let's explore whether elevation is the key factor in determining whether a station tends to import or export bikes. To do this, I plot each station's net flux against its elevation and look for patterns in the relationship.
The figure above clearly demonstrates how elevation causes station imbalances in Oslo's bike sharing system. Most stations below ~35m act as importer stations, while the situation above ~35m is reversed:
This has the consequence that bikes naturally "flow" downhill and accumulate near sea level. This creates an imbalance, where higher elevation stations consistently need bike restocking, while lower elevation ones consistently need bikes removed to create docking space. The ~35m elevation mark acts as a "watershed" for bike flow in the system.
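One simple way to check how well an elevation threshold separates importers from exporters is to compute the share of importers below it and the share of exporters above it. The sketch below is only an illustration under assumptions: stations are given as (elevation, net flux) pairs, the function name is hypothetical, and the 35 m default is the value read off the plot in this article, not something the code derives.

```python
def imbalance_by_elevation(stations, threshold_m=35.0):
    """Share of importers below and exporters at or above an elevation threshold.

    stations: iterable of (elevation_m, net_flux) pairs.
    Returns fractions between 0 and 1, or None for an empty group.
    """
    def share(values, predicate):
        if not values:
            return None
        return sum(1 for v in values if predicate(v)) / len(values)

    stations = list(stations)
    below = [flux for elev, flux in stations if elev < threshold_m]
    above = [flux for elev, flux in stations if elev >= threshold_m]
    return {
        "importer_share_below": share(below, lambda f: f > 0),
        "exporter_share_above": share(above, lambda f: f < 0),
    }


# Toy data: three low-lying stations and two stations above the threshold.
toy = [(5, 100), (20, 40), (30, -10), (60, -80), (90, -20)]
print(imbalance_by_elevation(toy))
```

Trying a range of thresholds and seeing where both shares are high would be one way to locate such a "watershed" numerically.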
Having established that there's a strong relationship between elevation and station imbalance, let's now visualize these imbalances on a map.
The interactive map below shows all bike stations with:
The interactive map above shows the geographic pattern of the imbalanced flow of bikes in the bike sharing system. This map paints the picture very clearly. In the city center near sea level is a large cluster of high-usage importer stations (green), which gobble up a large number of bikes throughout the day and act as bike sinks. Around this cluster is a diffuse semicircle of balanced stations where inflow and outflow cancel out. This ring is located at roughly 35m of elevation. Surrounding this is another diffuse ring of exporter stations (red) located at higher elevations in peripheral areas. These act as a large bike source and feed bikes into the system.
This situation has a huge impact on system operation. The top 20 most critical stations (just 7.7% of the network) require moving ~300 bikes daily!
The elevation analysis above painted an interesting picture: Bikes flow from higher to lower elevations and thereby create station imbalances. But how exactly do bikes move through the city? Elevation only gives a rough clue of the direction of the flow, but now I want to find out how all those trips connect the city together. By mapping every ride as a connection between stations, I can see which areas in town are busiest and which routes are most popular.
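Turning rides into connections amounts to counting how often each pair of stations is linked. The sketch below assumes that a connection is undirected (A to B and B to A count as the same link) and that round trips to the same station are ignored. Both choices are assumptions, and the function name `top_connections` is hypothetical.

```python
from collections import Counter


def top_connections(trips, n):
    """Return the n most travelled station pairs with their trip counts.

    trips: iterable of (start_station, end_station) pairs.
    """
    counts = Counter()
    for start, end in trips:
        if start == end:
            continue  # a round trip does not connect two different stations
        pair = tuple(sorted((start, end)))  # same key for both directions
        counts[pair] += 1
    return counts.most_common(n)


trips = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "A")]
print(top_connections(trips, 2))
```

The resulting counts can then be used directly as line widths when drawing the connections on a map.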
The interactive map below shows the result of this analysis. Drawn are the 600 most travelled connections. Line width indicates the connection frequency, and station type (exporter/importer) is indicated by marker color. The size of the marker is proportional to station usage.
The map above shows that the bike network can be split into two parts: A core network, which consists of densely connected central stations such as Aker Brygge, Oslo S. and Torggata. These act as bike importers. They consistently receive more bikes than they send out. This is the zone where the majority of the bike rides take place. Surrounding this core is a feeder network of peripheral stations at higher elevations that primarily export bikes downhill into the city center.
The top routes fall into two categories: There are leisure routes, which are short rides along the waterfront, like Oslo S ↔ Vippetangen or Aker Brygge ↔ Frognerstranda. These are the most frequently travelled individual routes, but together they make up only about 1% of all rides. The bulk of trips are commuter routes, which are high-frequency connections between busy stations like Alexander Kiellands Plass ↔ Torggata or Oslo S ↔ Kværnerbyen. These are the backbone of the network.
The analysis so far has averaged patterns across the entire year, smoothing out all temporal variations. We now know that there are importer and exporter stations, but when during the day do these exports and imports actually occur? This section explores the timing of these flows.
Let's look at how bike flows develop throughout the day. To get an overview of the daily pattern, I grouped the dataset by "hour of day" and then counted the hourly arrivals and departures. This gives us the flow pattern during a day, averaging out any seasonal variations such as winter and summer or weekends and weekdays, retaining only variations caused by the time of day.
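The grouping described above can be sketched in plain Python as follows. The sketch assumes that each trip is a tuple of start timestamp, end timestamp, start station, and end station, with timestamps as ISO-formatted strings. The function name and the `days` parameter (used to turn totals into a per-day average) are hypothetical, and timestamps are used as given, without any time zone conversion.

```python
from datetime import datetime


def hourly_net_flow(trips, station, days=1):
    """Net flow (arrivals minus departures) per hour of day for one station.

    trips: iterable of (started_at, ended_at, start_station, end_station).
    days:  number of days covered by the data, to get an average per day.
    """
    net = [0] * 24
    for started_at, ended_at, start, end in trips:
        if start == station:
            net[datetime.fromisoformat(started_at).hour] -= 1  # departure
        if end == station:
            net[datetime.fromisoformat(ended_at).hour] += 1    # arrival
    return [value / days for value in net]


trips = [
    ("2024-05-01 07:15:00", "2024-05-01 07:30:00", "A", "B"),
    ("2024-05-01 07:50:00", "2024-05-01 08:05:00", "A", "B"),
    ("2024-05-01 16:00:00", "2024-05-01 16:20:00", "B", "A"),
]
flow_a = hourly_net_flow(trips, "A")
print(flow_a[7], flow_a[16])
```

Departures are booked at the hour the ride starts and arrivals at the hour it ends, so a ride that crosses the full hour affects two different hour bins.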
Figure 6 shows the hourly bike flow for the top 5 exporter and importer stations.
The flow of bikes for the most imbalanced stations shows some very striking features. All exporter stations (red lines) remain largely negative throughout the day, indicating that bikes mainly flow one-way. There is a long morning rush from 05:00 to 09:00, with some local exceptions: Lindern (Ullevål Hospital) features a brief import spike at 06-07 and two large export spikes, one at 15:00 (probably hospital day shift ending) and another at 22:00 (evening shift ending). BI Nydalen has a brief import spike at 07-08, probably students and employees arriving at the BI campus. The most important feature, however, is that none of these stations shows a return of bikes in the evening.
Importer stations (green lines) located in the central areas of Oslo show large positive spikes, with peaks reaching 6+ bikes per hour net inflow. After a strong morning influx (07-09) they sustain high import levels throughout the day. The afternoon patterns (14-17) are quite variable. Oslo S and Aker Brygge spike again, probably because they are major transport hubs and important areas for leisure activities. Other core stations show more variable patterns, with some even briefly exporting bikes. Torggata is the top evening importer. This area is a popular destination for evening entertainment, dining and drinks. People bike there after work for social activities.
Now, let's look at what the hourly bike flow looks like for the most-used balanced stations.
Balanced stations feature the complete opposite behavior. The lines cross y=0, which indicates a two-way flow. There is no persistent directional bias. Bikes flow both in and out throughout the day, with a prominent export spike at 06-09 when people leave for work, and an equally prominent import spike at 15-18 when people return home after work. Alexander Kiellands Plass shows the most extreme pattern. A dramatic morning export (-7 bikes/hour) is followed by a strong afternoon import (6+ bikes/hour).
Let's investigate how the daily flow pattern differs between weekdays and weekends. The question to answer is whether the system is more popular for weekday commutes or weekend leisure rides. First, I will look at the daily trip count for each day of the week to see when in the week the majority of rides take place.
The figure above shows the average number of rides by day of the week. While intuition might suggest that the bike sharing system is predominantly used for leisure rides, the plot above paints a different picture. Rides on the weekend are actually 37% less frequent than on weekdays. This shows that Oslo's bike sharing system is an integrated component in the city's transportation infrastructure and frequently used for commuting trips.
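A comparison like this can be reproduced from one date per ride. The sketch below is a minimal version under assumptions: the function name is hypothetical, and days without any rides are simply not counted. It also reports the difference in both directions, since "weekends are X% less busy" and "weekdays are Y% busier" are different numbers for the same data.

```python
from collections import Counter
from datetime import date


def weekend_vs_weekday(ride_dates):
    """Compare average daily ride counts on weekdays and weekends.

    ride_dates: iterable of datetime.date objects, one per ride.
    """
    rides_per_day = Counter(ride_dates)
    weekday = [n for d, n in rides_per_day.items() if d.weekday() < 5]   # Mon-Fri
    weekend = [n for d, n in rides_per_day.items() if d.weekday() >= 5]  # Sat-Sun
    avg_weekday = sum(weekday) / len(weekday)
    avg_weekend = sum(weekend) / len(weekend)
    return {
        "weekend_less_pct": 100 * (1 - avg_weekend / avg_weekday),
        "weekday_more_pct": 100 * (avg_weekday / avg_weekend - 1),
    }


# Toy data: 100 rides on a Monday, 63 rides on a Saturday.
rides = [date(2024, 5, 6)] * 100 + [date(2024, 5, 11)] * 63
print(weekend_vs_weekday(rides))
```

In this toy example the weekend is 37% less busy than the weekday, while the weekday is roughly 59% busier than the weekend, so it matters which direction is quoted.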
Now let's examine how the station imbalance patterns we identified earlier differ between weekdays and weekends. Using the same net flow analysis from before, I'll split the data into weekday and weekend to see if commuter and leisure behavior creates different flow patterns.
The figure above shows the hourly flow pattern for the top importer and exporter stations, evaluated for weekdays on the left and weekends on the right. The weekday flux patterns look almost identical to the earlier analysis of the full dataset. The huge morning and afternoon spikes are primarily a weekday phenomenon. The net flow on the weekend is a lot smaller than during the week and shows less variation throughout weekend days.
This analysis set out to understand how Oslo's bike sharing system really works. It turned out that the movement of bikes is not completely random, but is in fact guided by very predictable patterns. Three important aspects control the movements of bikes around town.
The gravity problem: The topographical analysis showed that cyclists have a strong preference for downhill over uphill travel. 59.6% of all rides go downhill while only 38.8% go uphill. People tend to ride bikes from higher elevations down to lower elevations in the city center, and then take other means of transportation back home. This leads to a depletion of bikes at elevations over ~35 meters, and stations lower than that consistently accumulate bikes. Without proper countermeasures, uphill stations would soon run out of bikes while downhill stations would run out of docking capacity.
Two-zone structure: The network analysis showed that the network can be divided into two parts. The first is the core network in downtown areas where most of the rides take place. This area acts as a bike sink where stations such as Aker Brygge, Oslo S and Torggata persistently import bikes throughout the day. Surrounding this core network is the feeder network of peripheral stations, which primarily export bikes downhill each day. The most popular routes fall into two categories: Short waterfront leisure rides and high-frequency commuter rides between major downtown hubs.
Weekday travel dominates: The temporal analysis showed that weekdays dominate the bike-sharing network. Weekend trips are 37% less frequent than weekday trips. Different stations show very different bike fluxes throughout the day. Balanced stations function as large bike exporters in the morning and as large importers in the afternoon. Downtown stations are mainly importers throughout the entire day, while peripheral stations mainly export. This pattern is much less pronounced on weekends, when the flows flatten out.
These three effects drive the bike flow in Oslo. Understanding these patterns shows that bike rebalancing in Oslo isn't random maintenance, but a daily battle against gravity. Restocking and bike removal needs follow predictable patterns controlled by time and location. Oslo Bysykkel clearly does a great job managing these challenges. The system works well despite the difficult topography. With insights like these, operators could potentially schedule their trucks even more efficiently and predict which stations will be empty by rush hour, and maybe even prevent those frustrating moments when you can't find a bike or a place to park one.
What started as a short weekend project turned into a full-blown rabbit hole. But it reminded me why I love data science: You can turn seemingly random data into something meaningful.
Huge thanks to the Bysykkel team for making these data publicly available, and for their daily effort of battling gravity to keep bikes where we need them!