Programming Crickles
R and Python
Here, as a diversion between Christmas and New Year, is a post for any Crickles users with a programming interest.
Crickles is programmed entirely in R and Python, with the vast majority done in R. Python is used for AWS lambdas. The HTTP server that manages interaction with Intervals perhaps ought to be in Python too: it was in Python in the past (before Intervals displaced Strava as our source of data) and will probably be rewritten in Python in future. For now, it happens to be in R.
All offline analysis, as well as the Navigator web app, is written in R. Since there are now many more programmers using Python than R, you might legitimately wonder why this is so.
To illustrate why R is preferred for this particular kind of work, I asked Claude to generate a fair comparison of how the two languages handle, or could handle, the sustainable performance analysis that is fully described, with code snippets, on this website here. Since the Polars library has substantially enhanced Python’s ability to handle dataframes, I asked Claude to give specific attention to where this could be beneficial.
Claude generated this comparison. As you can see if you read it, Claude concludes that:
Both R and Python are fully capable of handling the analysis.
With Polars, Python could complete the computation more quickly.
The code in R is more readable, concise and elegant than in Python.
Since this analysis is conducted occasionally and is offline, the third factor is more important than the second. In fact, little to no heavy-duty processing in Crickles occurs at run-time and so R usually wins for the same reason.
As it happens, the kinds of machine learning (ML) model used by Crickles are better supported in R, whereas other types of ML model that could be (but aren’t) used have much better support in Python. This isn’t a factor, however, in the sustainable performance analysis that Claude examined.
In Claude’s paper the R code is my own whereas the Python code is generated by Claude.