Skywave Project - First Phase
Published:
Skywave - Scheduling wireless communications under uncertainty
Over the past decade, my research group has repeatedly confronted the same obstacle in real‑time and distributed (RTD) systems: scheduling and analysis techniques degrade rapidly when the underlying assumptions about resource availability become uncertain. This is especially acute in wireless systems, where channel conditions can evolve in ways most schedulers struggle to react to, and where uncertainty accumulates through noise, interference, fading, and environmental dynamics. With financial support from ARDC to fund a dedicated PhD studentship, as well as the computing and communication infrastructure we obtained with the support of the University of Leeds, we launched the Skywave project, aiming to address this challenge directly.
The central premise is straightforward: use machine‑learning models, trained offline on large volumes of decoded message data, to provide quantitative estimates of transmission success likelihood, and integrate those estimates into RTD analysis and scheduling. This is not an incremental topic‑of‑the‑month experiment; it is a concerted attempt to push ML‑assisted real‑time systems into domains where classical analytic models have historically been inadequate. The Skywave project serves as the broader research umbrella and includes a case study focusing on the notoriously difficult exploration of HF propagation (i.e. wireless communications using the portion of the spectrum between 3 and 30 MHz), which enables low-latency, low-bandwidth global communications and is used in domains such as aircraft and maritime communications, amateur radio, military communications, and even low-latency trading in global financial markets. Unlike most channel-aware RTD analysis and scheduling, the Skywave approach will use models trained with message‑level data rather than physical ionospheric inputs. This will enable deployment on independent, low-cost platforms that have no way to access or measure ionospheric data, but that can always repurpose their HF radio to gather communication traffic data and so build a better picture of HF channel conditions. The trade-off between scheduling communications under greater uncertainty about the channel and waiting longer to better assess channel quality, by decoding traffic over a longer time window, arises in many areas of wireless communications. The methodology developed in the project (dataset capture, ML model training, scheduler design and evaluation) will therefore be useful in other communication scenarios, such as the Internet of Things (e.g. LoRa and IEEE 802.15.4).
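To make the listen-longer-versus-transmit-sooner trade-off concrete, here is a minimal sketch in Python. It is not the project's model: it assumes, purely for illustration, that each listening slot either yields a decoded message or does not, and it summarises the channel with a simple Beta posterior. The function name `beta_posterior` and the slot counts are hypothetical.

```python
from math import sqrt


def beta_posterior(decoded_slots: int, total_slots: int) -> tuple[float, float]:
    """Return (mean, std) of a Beta(1 + k, 1 + n - k) posterior.

    Illustrative assumption: k of n listening slots yielded a decoded
    message, and we start from a uniform Beta(1, 1) prior.
    """
    if not 0 <= decoded_slots <= total_slots:
        raise ValueError("decoded_slots must be between 0 and total_slots")
    a = 1 + decoded_slots
    b = 1 + total_slots - decoded_slots
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, sqrt(variance)


# Same observed decode ratio (50%), but progressively longer listening windows.
for slots in (4, 16, 64):
    mean, std = beta_posterior(slots // 2, slots)
    print(f"{slots:3d} slots: estimate {mean:.2f}, uncertainty (std) {std:.3f}")
```

The estimate stays the same while its uncertainty shrinks as the window grows, which is the price a scheduler pays in delay for a more confident decision.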
We have now completed the first phase of the project, which was to set up the computation and communication infrastructure needed to produce the dataset we’ll use to train the machine learning models. First, I installed a multi-band HF antenna on the roof of one of the University of Leeds buildings, which was a complex endeavour in itself given all the building conservation and health-and-safety regulations, and the amount of electromagnetic noise generated by the building management, air‑conditioning, and wireless systems. Then, I set up an SDR transceiver that covers the entire HF spectrum and can be controlled by an external computer.


In parallel with that, I conducted a recruitment process to choose the PhD student who would be granted the ARDC-funded studentship and would do most of the research and development work in the project. After reviewing 35 applications from 13 different countries, and interviewing 10 candidates over a two-stage interview process, the interview panel decided to offer the scholarship to Miguel Boing, who accepted and joined the PhD programme a few months later after completing the university registration and student visa procedures. Right away, Miguel had to familiarise himself with the PhD process and with the SDR transceiver and its software infrastructure, and to obtain an amateur radio license from Ofcom so that he could transmit using the SDR as part of our dataset creation process (up to that point, all the preliminary tests that resulted in the project proposal submitted to ARDC had been made under my own license).
Over the course of his first year as a PhD student, Miguel produced the literature review that every first-year PhD student must complete to establish the latest research in the area of their thesis. He also developed the software infrastructure that automatically controls the SDR (choose frequency, set transmission power, tune the antenna, control transmission and reception), gathers channel traffic data before a transmission (by decoding signals received from the SDR), and collects the reception reports produced worldwide by other stations that were able to decode messages from our transmissions (these reports reach us over the internet). For several months now, we have captured pre-transmission traffic and reception reports for our transmissions, which happen multiple times a day, at all hours, over four different parts of the HF spectrum (7 MHz, 14 MHz, 18 MHz and 21 MHz), and at three different levels of transmission power (1 W, 10 W and 25 W). The figures below show on a map the reception reports we received from stations all over the world that decoded our transmissions over a 24-hour period. Different colours represent the different transmission frequencies, which have distinct propagation patterns and ranges. The different figures represent the reports received at different levels of transmission power.

We will continue to enhance the dataset throughout the project, but the data gathered over the past five months already gives us a good resource to which we can apply our machine learning and data analytics skills. Besides Miguel and me, undergraduate and postgraduate students doing their final-year projects now have access to the dataset, so they can study the correlations between data that can be obtained by the transmitter (e.g. frequency, transmission power, time of day, decoded messages on the channel) and data that will not be available to a transmitter deployed in a remote area (e.g. whether its messages were received, which cannot be checked without internet access). Over the coming months, we will gain more insight into those correlations, and into how we can exploit them to make predictions on transmission success. Later in the project, we will make the dataset available to everyone, along with the tools we have developed to control the SDR and gather data, so that academics and radio operators can try our approach with their own stations and communication setups.
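As a rough illustration of the kind of correlation study described above, the following Python sketch groups transmission records by band and power and computes the fraction that received at least one reception report. The record layout (`TxRecord`), the definition of success, and all the values are made up for this example; they are not the project's dataset schema or results.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TxRecord:
    """Hypothetical record: transmitter-side data plus the reception outcome."""
    band_mhz: int           # e.g. 7, 14, 18 or 21
    power_w: int            # e.g. 1, 10 or 25
    hour_utc: int           # time of day of the transmission
    decoded_before_tx: int  # messages decoded on the channel before transmitting
    reception_reports: int  # reports received afterwards over the internet


def success_rate_by_config(records, min_reports=1):
    """Fraction of transmissions per (band, power) with >= min_reports reports."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for r in records:
        key = (r.band_mhz, r.power_w)
        totals[key] += 1
        if r.reception_reports >= min_reports:
            successes[key] += 1
    return {key: successes[key] / totals[key] for key in totals}


# Made-up sample values, for illustration only.
records = [
    TxRecord(7, 1, 3, 12, 0),
    TxRecord(7, 1, 15, 4, 2),
    TxRecord(14, 10, 12, 30, 5),
    TxRecord(14, 10, 13, 28, 7),
]

for (band, power), rate in sorted(success_rate_by_config(records).items()):
    print(f"{band} MHz at {power} W: {rate:.0%} of transmissions reported")
```

A real analysis would also condition on time of day and on the pre-transmission traffic, since those are the features a remote transmitter can actually observe.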
We have also started the design and evaluation of a transmission scheduler that can benefit from the correlations, insights and predictions of the machine learning and data analytics models we are developing. We already have a simple scheduler, used for the dataset gathering exercise, but it works on a fixed periodic schedule and has plenty of computational resources to run as well as unlimited energy (from an electrical socket!), so it can keep retransmitting a message until it is eventually received. It also has several ways to know that messages were received (e.g. via the internet, or by receiving acknowledgements over the wireless channel). Our challenge is to produce a scheduler that can operate autonomously in a deployment that has limited computational power and limited energy (e.g. powered by solar power), and that can predict the success of its transmissions without acknowledgements over the air or the internet, as that would be the likely scenario in a remote deployment (e.g. in the middle of the ocean). Once we make more progress in that direction, I’ll write another blog post about it. In the meantime, here’s some more technical information about the overall project.
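To give a flavour of what an energy-constrained decision might look like, here is a minimal Python sketch. It is an assumption-laden illustration, not the scheduler we are designing: the function `choose_transmission`, the 15-second transmission duration, the probability threshold, and the idea of ranking options by predicted success per joule are all hypothetical, and the energy estimate ignores amplifier efficiency and receiver consumption.

```python
from typing import Dict, Optional, Tuple

Config = Tuple[int, int]  # (band in MHz, transmission power in W)


def choose_transmission(
    predictions: Dict[Config, float],
    energy_budget_j: float,
    tx_duration_s: float = 15.0,   # hypothetical transmission length
    min_probability: float = 0.5,  # hypothetical acceptance threshold
) -> Optional[Config]:
    """Pick the affordable configuration with the best predicted success per joule.

    Returns None when no configuration is both affordable and likely enough,
    meaning the node should wait and keep listening instead.
    """
    best: Optional[Config] = None
    best_score = 0.0
    for (band_mhz, power_w), probability in predictions.items():
        energy_j = power_w * tx_duration_s  # simplification: RF output energy only
        if energy_j > energy_budget_j or probability < min_probability:
            continue
        score = probability / energy_j
        if score > best_score:
            best, best_score = (band_mhz, power_w), score
    return best


# Made-up predicted success probabilities for one band at three power levels.
predictions = {(14, 1): 0.3, (14, 10): 0.7, (14, 25): 0.9}
print(choose_transmission(predictions, energy_budget_j=200.0))
print(choose_transmission(predictions, energy_budget_j=100.0))
```

With a 200 J budget the sketch settles on the 10 W option, because 1 W is predicted to be too unreliable and 25 W is unaffordable; with 100 J nothing qualifies, so it waits.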
