This is an installment of our “Community Member Spotlight” series, where we invite our customers to share their work, shining a light on their success and inspiring others with new ways to use technology to solve problems.
In this edition, Rob Robinett and Gwyn Griffiths, the creators of WsprDaemon, join us to share the work they’re doing to allow amateur radio enthusiasts to analyze transmission data and understand trends, be it their own personal noise levels or much larger space weather patterns.
Amateur radio is a hobby for some three million people worldwide (see “What is Amateur Radio?” to learn more), and its technical scope is vast. Examples include designing and building satellites, devising bandwidth-efficient data communications protocols, and creating novel, low-noise antennas for use in urban locations.
Our project, WsprDaemon, focuses on amateurs who use the (amateur-developed!) open-source Weak Signal Propagation Reporter (WSPR): a protocol that uses low-power radio transmissions that probe the earth’s ionosphere to identify radio propagation paths and provide insights on space weather. On a typical day, 2,500 amateurs may report 2.7 million WSPR “spots” to the wsprnet database, whose webpage interface allows simple queries on data collected in the last two weeks.
Radio signals that end up in the WsprDaemon TimescaleDB database may be received on a wide variety of antennas, from the 94-foot tower-supported multiband array in Northern Utah pictured above, to more modest 3-foot installations that you may see in many suburban or urban locations.
About the Team
We have a small, two-member volunteer core team, and we’re supported by a dozen or so beta testers and radio specialists (we typically have people from six countries on a weekly Zoom meeting). Rob Robinett, based in Berkeley, California, is CEO of TV equipment manufacturer Mystic Video, and he has founded a series of Silicon Valley startups. He recently “rediscovered” amateur radio after an absence of more than 40 years, and he’s applying his software expertise to developing systems that measure shortwave radio transmission conditions.
Gwyn Griffiths, based in Southampton, UK, returned to amateur radio after retiring from a career as an ocean technologist, where he worked with sensors and data from ships, undersea moorings, and robotic underwater vehicles. Gwyn focuses on the TimescaleDB components, devises Grafana dashboards to help inspire WsprDaemon users to create their own, and writes our step-by-step guides (check them out here).
About the project
WsprDaemon ingests data from the wsprnet database into TimescaleDB. This gives users access to historical data (remember, the wsprnet database only shows data from the last two weeks online, whereas our project lets users draw on older data for more robust analysis) and enables a rich range of SQL queries.
Additionally, TimescaleDB facilitates our Grafana dashboards; seeing a month of “spots” gives users a far deeper understanding about their own data, enables comparisons with other datasets, and provides a platform for further experimentation and creative graphics.
Our TimescaleDB application caters to a wide spectrum of radio amateurs, from casual hobbyists to third-party developers:
- Some hobbyists simply want to see lists of who’s heard their transmissions in the last hour, or who they heard, at what strength, and where the transmissions originated.
- Other users want to display transmission metrics as time-series graphs, while there’s another class of users for whom the ability to use aggregate functions, apply defined time buckets, derive statistics, and create heat maps and other visualizations is essential (such as the international Ham Radio Science Citizen Investigation community).
- Last, third-party app developers, like the VK7JJ listing, WSPR watch, and other mapping and graphing apps, also access our WSPR data, appreciating the fast query response.
The key measurement for a WSPR receiver is the signal-to-noise ratio (SNR): how strong an incoming signal is compared with the background noise. But there is also vital metadata, including the location of the receiver and transmitter, the operating radio frequency, and, most critically, time. On average, our database takes in about 4,000 “sets” of this data from a given transmitter, typically 400kB, every two minutes.
The example below shows SNR from three transmitters, in New York State, Italy, and Virginia.
The seven-day time-series data shown in this dashboard example provides rich information for its receiver, station KD2OM in New York State:
- They consistently hear N2AJX, just 16km distant, whose radio waves will have travelled over the ground, at much the same SNR throughout the day.
- They hear WA4KFZ in Virginia throughout most days, but with a dramatic variation in SNR. The SNR is at a minimum in the hours before sunrise (all times are UTC), and its peaks rise above the SNR of the local station. This is the ionosphere at work, providing a path with less loss over 1000km than the over-the-ground path has over 16km. The time-series view also allows us to see day-to-day variations, such as the shorter period of high SNR on 23rd June compared to prior days.
- They hear IU0ICC from Italy via the ionosphere from early evening to about 0300 local time each day, with a consistent shape to the rise and fall of SNR.
While SNR is the main measurement, our users are also interested in aggregate metadata functions, which provide an overview of daily changes in the ionosphere. Our project allows them to run these more complex queries, and we bring in complementary public domain data, such as the case below where we pull in data from the US Space Weather Prediction Center.
In this example, the top panel of the Grafana dashboard uses the WsprDaemon dataset to display a simple count of the “spots” in each 10-minute time bucket with, on the second y-axis, a measure of the planetary geomagnetic disturbance index (Kp) from the US Space Weather Prediction Center. In 2020, we’re at the minimum of the 11-year sunspot cycle, so our current space weather is generally very quiet, but we’re anticipating disturbances, as well as more WSPR spots, as the sun becomes more active over the next four years.
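To make the idea of a time bucket concrete, here is a minimal Python sketch of counting spots per 10-minute bucket. It is an illustration only, not the query behind the dashboard: the function names and sample timestamps are invented, and in TimescaleDB the same grouping would normally be done in SQL with the `time_bucket()` function.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

BUCKET = timedelta(minutes=10)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts, width=BUCKET):
    """Round a timezone-aware timestamp down to the start of its bucket."""
    return ts - (ts - EPOCH) % width


def count_spots(spot_times, width=BUCKET):
    """Return {bucket_start: number_of_spots}, ordered by time."""
    counts = Counter(bucket_start(t, width) for t in spot_times)
    return dict(sorted(counts.items()))


def utc(hour, minute):
    """Helper for the made-up sample data below (all on 23 June 2020, UTC)."""
    return datetime(2020, 6, 23, hour, minute, tzinfo=timezone.utc)


# Invented spot times, purely for illustration.
sample = [utc(12, 0), utc(12, 2), utc(12, 8), utc(12, 10), utc(12, 24)]

for start, n in count_spots(sample).items():
    print(start.isoformat(), n)
```

With this sample, three spots fall in the 12:00 bucket and one each in the 12:10 and 12:20 buckets. Buckets with no spots are simply absent, so a plotting tool has to decide whether to show them as zero or as gaps.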
The second panel is a heat map that shows the variation in distance between the receiver in Belgium and the transmitters it’s heard over time.
The third panel shows the direction of arrival at the receiver, while the bottom panel helps the user interpret all of this data, showing the local noise level and instances of overload.
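The distance and direction panels both derive from the receiver and transmitter locations. A sketch of one common way to compute them is shown here, assuming the locations are already available as latitude and longitude in decimal degrees and that the signal arrives along the shorter great-circle path. The function name and the spherical-Earth radius are assumptions for illustration; the source does not say how WsprDaemon calculates these values.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; treats the Earth as a sphere


def distance_and_bearing(rx_lat, rx_lon, tx_lat, tx_lon):
    """Great-circle distance (km) and initial bearing (degrees clockwise
    from true north) from the receiver towards the transmitter."""
    phi1, phi2 = math.radians(rx_lat), math.radians(tx_lat)
    dphi = phi2 - phi1
    dlon = math.radians(tx_lon - rx_lon)

    # Haversine formula for the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2)
    distance_km = 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

    # Initial bearing along the great circle, normalised to [0, 360).
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    bearing_deg = math.degrees(math.atan2(y, x)) % 360
    return distance_km, bearing_deg


# A quarter of the way around the equator, heading due east.
print(distance_and_bearing(0.0, 0.0, 0.0, 90.0))
```

The haversine form is used because it stays numerically stable for short paths, such as the 16km example earlier in this post.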
Editor’s note: see our Grafana Series Override blog post to learn how (and why) to use two y-axes to more accurately plot your data and understand trends.
As evidenced above, a big advantage for our users is our ability to bring disparate datasets together into one database, with one graphical visualisation tool.
Our initial need was for a database and visualisation tool for radio noise measurements from a handful of stations. A colleague suggested Influx and Grafana, and kindly set up a prototype for us. We were hooked.
We sought to expand to cover a larger range of data sets from several thousand sources. The Influx documentation was great, and we had an extended application running quickly. Initially, our query time performance was excellent, but as we accumulated weeks of data, we hit the cardinality issue. Query times became unacceptably long, and we looked for an alternative time-series database solution.
We quickly came across an objective article on how to solve the cardinality problem that led us to adopt TimescaleDB. The biggest factor in choosing TimescaleDB was that it solved our cardinality problem, but there were also “nice to have” features, such as the easy-to-use tool to migrate our data from Influx and the underlying use of PostgreSQL. But we did miss Influx’s comprehensive single-source documentation.
Editor’s Note: Because we think it’s important to remain balanced and let our community members’ voice shine through, we don’t edit mentions of alternative technologies (favorable or unfavorable 🙂).
Current deployment & future plans
Our initial TimescaleDB implementation is on a DigitalOcean Droplet (2 cores, 4GB memory, 100GB SSD disk), but we are moving to our own 16-core, 192GB memory Dell server and a back-up (we’re evaluating query performance as our user base grows).
As noted above, the way TimescaleDB has solved the issue of cardinality was a big selling point for us, and it’s what makes the WsprDaemon site performant for our users.
- When we were using Influx, a typical query that returned 1,000 results from a table of 12 million records and a cardinality of about 400,000 took 15-25 seconds.
- Now, running TimescaleDB on the same DigitalOcean Droplet (albeit with 4GB rather than the previous 2GB of memory), those same queries overwhelmingly return results in under 2 seconds*.
- *As long as the data requested is within the chunk that is in memory. That’s why we’ve recently increased our Dell server memory from 24GB to 192GB, to handle one-month chunks, and why it will become our primary machine.
We use Bash shell scripts on Linux, together with Python, to gather the data that populates our database tables. We find that batch upload using psycopg2.extras.execute_batch works well for us, and our users use a variety of methods to access WsprDaemon, including Node.js and the psql command-line interface.
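For readers who have not used it, a batch upload with `psycopg2.extras.execute_batch` might look roughly like the sketch below. The table name, column names, and sample spot are hypothetical and are not WsprDaemon’s actual schema. Only `execute_batch` itself and its `page_size` parameter come from psycopg2.

```python
# Hypothetical schema: the table and column names are assumptions for illustration.
TABLE = "spots"
COLUMNS = ("time", "rx_call", "tx_call", "frequency_mhz", "snr_db")

INSERT_SQL = "INSERT INTO {} ({}) VALUES ({})".format(
    TABLE, ", ".join(COLUMNS), ", ".join(["%s"] * len(COLUMNS))
)


def spot_to_row(spot):
    """Turn a dict describing one spot into a tuple in column order."""
    return tuple(spot[c] for c in COLUMNS)


def upload_spots(conn, spots, page_size=100):
    """Insert spots over an open psycopg2 connection, page_size rows per round trip."""
    # Imported here so the helpers above can be used without psycopg2 installed.
    from psycopg2.extras import execute_batch

    rows = [spot_to_row(s) for s in spots]
    with conn:  # commits on success, rolls back on error (does not close conn)
        with conn.cursor() as cur:
            execute_batch(cur, INSERT_SQL, rows, page_size=page_size)


# A made-up spot, for illustration only.
example_spot = {
    "time": "2020-06-23 12:00:00+00",
    "rx_call": "RX1",
    "tx_call": "TX1",
    "frequency_mhz": 14.0971,
    "snr_db": -12,
}
print(INSERT_SQL)
print(spot_to_row(example_spot))
```

Calling `upload_spots(conn, [example_spot])` with a connection from `psycopg2.connect(...)` would send the rows in groups of `page_size` statements, which cuts network round trips compared with executing one insert per spot.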
We already make extensive use of Grafana dashboards, and we expect to expand our capabilities; adding global map panels is just one example. But even after extensive searching, it’s not always straightforward or obvious how to obtain the best, streamlined end result.
For example, creating an animation that shows the global geographic distribution of receivers and transmitters by hour requires us to export data to CSV using psql, import the file into Octave, generate maps on an azimuthal equidistant projection, save these map files as PNG, and then import into ImageJ to generate an AVI file.
Our future path includes collaboration with others in the global amateur radio community and connections to more data sources. We continually learn about people who have neat solutions for handling and visualising data, and, by sharing knowledge and experience, we can collectively grow and improve the tools we offer this great community. We’re keen to expand our connections to other third-party data sources, such as space weather, to help our users better interpret their results.
Getting started advice & resources
We’re non-professional database users, so we only feel qualified to speak to others with a similar minimal level of prior familiarity.
As you evaluate time-series database options, of course, read independent comparisons and the links they provide, but also look carefully at insightful, fair-minded comparisons from TimescaleDB, e.g., on SQL vs Flux. Try to assess the advantages of different approaches for your application, current and future skill sets, and requirements.
We believe that data analytics for radio amateurs is in its infancy. We’re detailing our approach and experience with TimescaleDB and Grafana in a paper at the 39th gathering of the Digital Communications Conference in September 2020 (for the first time, the Conference will be completely virtual, which is likely to enable a larger-than-usual participation from around the world). We’ll feature some nice examples of how self inner joins help pull out features and trends from comparisons, as well as many other capabilities of interest to our users.
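As a taste of what a self inner join can do, the sketch below joins a table of spots to itself to compare how two receivers heard the same transmitter at the same time. It is a toy illustration under stated assumptions: the table layout, callsigns, and values are invented, and it uses Python’s built-in SQLite as a stand-in so that it runs anywhere. The same join pattern carries over to PostgreSQL and TimescaleDB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spots (time TEXT, rx_call TEXT, tx_call TEXT, snr_db INTEGER)"
)
# Invented data: two receivers (RX1, RX2) and two transmitters (TX1, TX2).
conn.executemany(
    "INSERT INTO spots VALUES (?, ?, ?, ?)",
    [
        ("2020-06-23 12:00", "RX1", "TX1", -12),
        ("2020-06-23 12:00", "RX2", "TX1", -20),
        ("2020-06-23 12:02", "RX1", "TX1", -10),
        ("2020-06-23 12:02", "RX2", "TX2", -15),
    ],
)

# Join the table to itself: alias a is one receiver, alias b is the other.
# Only rows where both heard the same transmitter in the same interval survive.
QUERY = """
SELECT a.time, a.tx_call, a.snr_db - b.snr_db AS snr_diff_db
FROM spots AS a
INNER JOIN spots AS b
  ON a.time = b.time AND a.tx_call = b.tx_call
WHERE a.rx_call = ? AND b.rx_call = ?
ORDER BY a.time
"""

rows = conn.execute(QUERY, ("RX1", "RX2")).fetchall()
for row in rows:
    print(row)
```

Only the 12:00 interval produces a row, because that is the only time both receivers reported TX1. The difference column shows RX1 receiving it 8 dB better than RX2.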
We’d like to thank Rob & Gwyn for sharing their story, as well as for their work to create open-source, widely distributed queries, graphs, and tools. Their dedication to making transmission data accessible and consumable for the global amateur radio community is yet another testament to how technology and creativity combine to breed amazing solutions.
We’re always keen to feature new community projects and stories on our blog. If you have a story or project you’d like to share, reach out on Slack (@lacey butler), and we’ll go from there.
Additionally, if you’re looking for more ways to get involved and show your expertise, check out the Timescale Heroes program.