FDR was a blast! We got to present our work to our liaison at the private showcase, see other teams present, give our final presentation, network, and describe our project to the public.
Our project was a success: we accomplished the goals MRSL tasked us with. We built a code-free user interface pipeline for training and evaluating agents, and we demonstrated that the agents we trained can outperform greedy/heuristic strategies.
IPPD has been an incredibly rewarding experience, and we are all grateful to have been part of the program and to have worked with our wonderful sponsor, MRSL. Our code will go to space one day!!! 🚀
This week, the Orbiteers attended the FDR peer review presentation! We got valuable feedback from the coaches and students in the room and will edit our presentation (visually and verbally) as needed.
Additionally, we submitted the polished draft of the FDR! It was a race to the finish to get the four volumes completed. We look forward to modifying it as needed after Dr. Grant’s review.
This week, the Orbiteers focused on refining the system and preparing for final delivery. We also re-recorded our project video, capturing a clearer look at CAPS in action and how everything comes together.
On the training side, the team began testing new PPO-based models and comparing them against our existing approaches. We also created new graphs to visualize how trained agents perform compared to simpler strategies, making it easier to see the improvements in decision-making.
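For readers curious what one of these PPO experiments looks like in code, here is a minimal sketch using RLlib, the RL library that pairs with the Ray Tune tooling from our earlier posts. The environment, hyperparameters, and metric key are placeholders, not our actual CAPS configuration:

```python
# Minimal RLlib PPO sketch; all values are illustrative, not CAPS settings.
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")  # stand-in for the real tasking environment
    .training(lr=3e-4, gamma=0.99, train_batch_size=4000)
)

algo = config.build()
for _ in range(5):
    result = algo.train()
    # the exact metric key differs across Ray versions
    print(result.get("episode_reward_mean"))
```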
Development efforts continued across both the backend and frontend. Updates were made to support training different model types more easily, and the launcher interface was improved with small usability features like tooltips to help explain inputs.
As we head into the final stretch, the focus is on polishing the system and completing deliverables, including final testing, retraining models, and wrapping up the Final Design Review materials.
This week, the Orbiteers presented our system at Prototype Inspection Day, marking an exciting step as CAPS continues to take shape. It was a great opportunity to show how the different pieces of the project are coming together and to get feedback moving into the final stretch.
On the training side, the team continued scaling Population-Based Training (PBT) across multiple machines. Recent testing confirmed that this approach not only identifies strong hyperparameter settings, but also helps models improve more effectively throughout training.
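For context, Ray Tune ships a built-in PBT scheduler, and a minimal sketch of wiring one up is below. The mutation ranges, metric key, and population size are illustrative assumptions rather than our production settings:

```python
# Hypothetical PBT setup; the search space and sizes are assumptions.
from ray import tune
from ray.tune.schedulers import PopulationBasedTraining

pbt = PopulationBasedTraining(
    time_attr="training_iteration",
    perturbation_interval=5,                 # exploit/explore every 5 iters
    hyperparam_mutations={
        "lr": tune.loguniform(1e-5, 1e-3),
        "train_batch_size": [2000, 4000, 8000],
    },
)

tuner = tune.Tuner(
    "PPO",                                   # RLlib's PPO, registered with Tune
    param_space={"env": "CartPole-v1", "lr": 1e-4},
    tune_config=tune.TuneConfig(
        scheduler=pbt,
        metric="episode_reward_mean",        # exact key varies by Ray version
        mode="max",
        num_samples=8,                       # population size
    ),
)
tuner.fit()
```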
Frontend development also made solid progress, with new visualizations added to the CAPS Launcher. These updates allow users to explore results more interactively through dropdowns and preset plots, making it easier to understand model performance.
As we move into the final stretch, the team has also been focused on bringing everything together. Recent work has centered on merging changes across branches and keeping tests up to date to ensure the system remains stable as development continues.
Looking ahead, the team will begin extending training to additional model types while also starting work on the Final Design Report. With inspection behind us, the focus now shifts toward polishing and final delivery.
Returning from Spring Break, the Orbiteers brushed off the sand and shook the water out of our ears in preparation for the final stretch of IPPD. Beach vacations are temporary, but Earth observation is forever!
With renewed momentum, the team hit the ground running on several fronts, continuing to build toward a fully integrated, high-performing system. On the training side, a key milestone was reached in validating the viability of a head node for Population-Based Training (PBT). This configuration lets the head node serve a dual purpose as both coordinator and worker, ensuring that computational resources are fully utilized.
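As a rough sketch of what this looks like in practice (the shell commands in the comments are the standard Ray cluster bootstrap; the address and GPU counts are placeholders):

```python
# Standard Ray bootstrap, run in a shell (placeholders):
#   head machine : ray start --head --num-gpus=1
#   each worker  : ray start --address=<head-ip>:6379 --num-gpus=1
# Because the head node advertises its own GPU, Ray schedules trials onto
# it just like any other node, so no hardware sits idle.
import ray

ray.init(address="auto")        # attach the driver to the running cluster
print(ray.cluster_resources())  # head and worker GPUs all show up here
```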
The benchmarking and integration efforts also saw significant progress this week. The CAPS Launcher codebase was successfully merged into the benchmarking branch, marking an important step toward connecting backend evaluation scripts with the frontend interface. This broke some unit tests, but the team is actively working to resolve these issues and stabilize the integration. Additionally, early work has begun on incorporating D3.js visualizations, allowing benchmarking results to be displayed dynamically within the frontend. These visual tools will provide clearer insights into model performance and make the application more interactive for users. The image below shows a peek into what we will develop.
Beyond development, the team also began preparing for final deliverables, including drafting a storyboard for the end-of-semester project video. This ensures that, alongside technical progress, we are effectively communicating the impact and functionality of our work.
We are preparing for PID next week and FDR by the end of April, so the development and administrative work will continue in parallel. It is so close, we can almost taste it!
In the final week before Spring Break, the Orbiteers made solid progress in the usual subteams of benchmarking and training, with new work on a user interface for our application ramping up.
Between testing, plotting, and streamlining, the benchmarking team has stayed busy across the board. Many of these changes were made to better understand the results we see from each benchmarked model, improving the team’s sense of model progress and viability. Results were also converted from .csv files into .pkl files, which speeds up loading them for analysis (a quick sketch of the idea is below). Most notably, the first full benchmarking runs were completed with the trained agents, with mixed results: the agents beat the greedy strategies on some metrics but not on others.
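The format conversion itself is nearly a one-liner with pandas (file names below are placeholders):

```python
# Convert benchmark output from CSV to pickle; reloading the pickle skips
# text parsing entirely, which is where the speedup comes from.
import pandas as pd

df = pd.read_csv("benchmark_results.csv")   # placeholder file name
df.to_pickle("benchmark_results.pkl")

# later runs load the binary file directly instead of re-parsing text
df = pd.read_pickle("benchmark_results.pkl")
```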
To remedy this, the training team has been implementing better training strategies to make sure all promising avenues are covered. The first test run of multi-node Population-Based Training (PBT) was completed on AWS using Ray Tune, running two concurrent trials with two GPU workers attached. This enables much wider and more efficient search spaces when tuning the agent’s hyperparameters, and we expect the gains to show up as a considerable increase in agent performance in upcoming benchmarking runs.
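To give a flavor of the resource math, here is a rough sketch of per-trial GPU reservations in Ray Tune. The training function is a hypothetical stand-in, and the CPU/GPU counts are assumptions; under this reading, each trial reserves two GPUs, so two trials run side by side on a four-GPU cluster:

```python
from ray import tune

def train_agent(config):
    # hypothetical stand-in for one trial's training loop
    return {"episode_reward_mean": 0.0}  # placeholder final metric

# each trial reserves 2 GPUs; with 4 GPUs available, 2 trials run at once
trainable = tune.with_resources(train_agent, {"cpu": 8, "gpu": 2})
tuner = tune.Tuner(trainable, tune_config=tune.TuneConfig(num_samples=2))
tuner.fit()
```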
Some new work: the user interface will provide a landing page for operators to interact with the application. Several updates were made to the training and benchmarking tabs, and significant user control over the simulation was added. The image below provides a current look at this UI.
With next week being Spring Break, progress will look a little different, and there will not be a blog post. Upon our return, we will be ready to hit the ground running, prepping for the final stretch into PID and FDR!
This week, the Orbiteers made meaningful strides across all three sub-teams on the CAPS project, with progress ranging from detection pipeline fixes to new scheduling strategies and image scoring.
On the object detection front, the GPU testing team tracked down a bug that was causing inconsistent results between two different ways of running the same detection model. After identifying and fixing the root cause, the two pipelines now produce nearly identical results, differing by less than 2%. The team will also work on packaging this detection pipeline into a self-contained container so it can be more easily deployed, targeting completion by March 12th.
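The parity figure comes from running both pipelines on the same inputs and comparing outputs. A deliberately simplified sketch of that kind of check (a real comparison would also match boxes and scores; the numbers here are made up):

```python
import numpy as np

# detections per image from each pipeline (made-up numbers)
counts_a = np.array([12, 7, 30, 5])
counts_b = np.array([12, 7, 29, 5])

rel_diff = np.abs(counts_a - counts_b).sum() / counts_a.sum()
print(f"relative difference in detections: {rel_diff:.1%}")  # ~1.9%
```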
The RL and benchmarking teams also had a productive week. A smarter training scheduler was added that can automatically adjust how multiple simultaneous training runs are configured, and early tests confirmed it’s working as intended. The benchmarking team also made improvements to CI/CD and wrapped up initial testing of three satellite tasking strategies: max priority, closest, and least slew (a simplified sketch of these heuristics follows below). The team plans to dig deeper into additional strategies and begin benchmarking new models by early April. For the upcoming week, benchmarking will be working to visualize preliminary results, and some progress has already been made on this front, as seen below:
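Here is that simplified sketch of the three heuristics; the target fields (priority, pos, bearing_deg) are hypothetical stand-ins for the real scenario schema:

```python
import math

def max_priority(targets):
    """Pick the highest-priority visible target."""
    return max(targets, key=lambda t: t["priority"])

def closest(targets, sat_pos):
    """Pick the target nearest the satellite (Euclidean stand-in)."""
    return min(targets, key=lambda t: math.dist(sat_pos, t["pos"]))

def least_slew(targets, boresight_deg):
    """Pick the target needing the smallest pointing change."""
    return min(targets, key=lambda t: abs(t["bearing_deg"] - boresight_deg))

targets = [
    {"id": "A", "priority": 3, "pos": (10.0, 4.0), "bearing_deg": 35.0},
    {"id": "B", "priority": 9, "pos": (2.0, 1.0), "bearing_deg": -50.0},
]
print(max_priority(targets)["id"])          # B
print(closest(targets, (0.0, 0.0))["id"])   # B
print(least_slew(targets, 30.0)["id"])      # A
```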
Overall, another productive week for the Orbiteers! The team continues to make measurable progress on all fronts, and the coming weeks are set up to build meaningfully on what’s been accomplished.
This week, the Orbiteers built on the feedback from the first QRB presentation and made the adjustments needed for a more successful QRB 2 presentation this past Tuesday. All three sub-teams also continued making progress on the CAPS project.
Regarding our second QRB presentation, the overall feedback was more positive than last time. One criticism of the first presentation was that we did not clearly explain the problem our project aims to solve; this time, the feedback suggests our explanation was much clearer. That is a notable improvement, because clearly defining the problem for an uninitiated audience is vital both for their understanding and for our ability to receive valuable feedback. We were also better prepared for possible questions and concerns from the judging panel, which showed in our handling of the Q&A and in clarifying misconceptions that panelists had raised in prior presentations. The Orbiteers presented on Tuesday in the MAE B building on the UF campus, as pictured below:
For CAPS project progress this week, the GPU testing team collected valuable metrics comparing object detection with PyTorch vs. TensorRT, gathering precision, recall, and F1 to evaluate the strengths and weaknesses of each pipeline. The RL team added evaluation-episode functionality to the Ray Tune experiment runner and improved training robustness by adding slight jitter to observation channels (a sketch of that idea is below). The benchmarking team completed a new script that collects all necessary metrics into a single .csv file, along with another script that runs in the CI/CD pipeline to compare the metrics output by the benchmark script.
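The jitter idea is simple: perturb selected observation channels with small random noise so the agent can’t overfit to exact values. A minimal sketch as a Gymnasium wrapper, with illustrative channel indices and noise scale:

```python
import gymnasium as gym
import numpy as np

class ObsJitter(gym.ObservationWrapper):
    """Add small Gaussian noise to selected observation channels."""

    def __init__(self, env, channels=(0, 1), sigma=0.01):
        super().__init__(env)
        self.channels = list(channels)
        self.sigma = sigma

    def observation(self, obs):
        obs = obs.copy()
        obs[self.channels] += np.random.normal(0.0, self.sigma, len(self.channels))
        return obs

# illustrative usage on a placeholder environment
env = ObsJitter(gym.make("CartPole-v1"))
obs, info = env.reset()
```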
Overall, a very productive week! With improvements made and progress continuing, the Orbiteers march onward into the next one.
This week we continued building momentum, with each subteam pushing forward and beginning to produce more measurable results.
The benchmarking team developed and tested new CI (continuous integration) scripts to run benchmarks across multiple target selection strategies and scenario combinations. This gives us clearer insight into execution times and overall system performance under varying conditions.
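Conceptually, the sweep is a cross-product of strategies and scenarios with a timer around each run. A minimal sketch (benchmark.py and its flags are placeholders for our actual scripts):

```python
import itertools
import subprocess
import time

strategies = ["max_priority", "closest", "least_slew"]
scenarios = ["scenario_a.json", "scenario_b.json"]

for strategy, scenario in itertools.product(strategies, scenarios):
    start = time.perf_counter()
    subprocess.run(
        ["python", "benchmark.py", "--strategy", strategy, "--scenario", scenario],
        check=True,  # fail the CI job if any benchmark run fails
    )
    print(f"{strategy} x {scenario}: {time.perf_counter() - start:.1f}s")
```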
Meanwhile, the GPU testing team preprocessed and cleaned a satellite imagery dataset, enabling more reliable and consistent pipeline evaluation. Initial results across 500 images are promising: TensorRT achieved a 40% speedup in FP32 with only a 1.5% change in detections, and a 60% speedup in FP16 with just a 2.3% change. These results highlight a strong performance gain with minimal accuracy tradeoff.
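For reference, enabling FP16 in TensorRT comes down to a single builder flag. A rough sketch of building an FP16 engine from an ONNX model (paths are placeholders, error handling is omitted, and details vary across TensorRT versions):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("detector.onnx", "rb") as f:   # placeholder model path
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)    # the one flag that enables FP16

engine_bytes = builder.build_serialized_network(network, config)
with open("detector_fp16.engine", "wb") as f:
    f.write(engine_bytes)
```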
On the reinforcement learning side, the team integrated Ray Tune to automate experiment sweeps with detailed logging. This has already produced initial hyperparameter tuning results and sets the stage for faster iteration and more efficient optimization moving forward.
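As a minimal illustration of what Ray Tune automates, here is a toy sweep; the objective function and search space are stand-ins, not our training code:

```python
from ray import tune

def objective(config):
    # toy stand-in for a training run; Tune logs each trial's config/result
    score = -(config["lr"] - 3e-4) ** 2
    return {"score": score}

tuner = tune.Tuner(
    objective,
    param_space={"lr": tune.loguniform(1e-5, 1e-2)},
    tune_config=tune.TuneConfig(metric="score", mode="max", num_samples=20),
)
results = tuner.fit()
print(results.get_best_result().config)  # best learning rate found
```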
In parallel, Stefano worked with our liaisons to design early concepts for an updated simulator frontend. The proposed interface includes a landing page with clear entry points for running simulations, training, and benchmarking. A very simplified version of his diagrams is shown below.
Another productive week! We are looking forward to QRB2 next Tuesday and to keeping the progress going.
This week, instead of our usual class, we had an IPPD project work day to keep our project moving forward. The team met over Zoom (see photo below) to collaborate across sub-teams and made great progress.
The GPU Testing Team created FP16 TensorRT engines that produce similar results to FP32 engines but run about 20% faster, giving us another performance boost for inference tasks. The Benchmarking Team added the ability to alter the satellite starting location in the orchestrator for scenario testing. They also finished integrating the lightweight runner into the benchmarking script, making it easier to test and evaluate other tasking methods. Additionally, they added a benchmark stage to the GitLab CI file, which can automatically run a test instance of the benchmarking script when merging code. Finally, the RL Training Team constructed a detailed training plan for phase-based model training sequences that can be run either sequentially or concurrently, giving the team flexibility in orchestrating experiments.
It was a productive week thanks to the IPPD project work day! Each sub-team is moving forward, and we’re excited to see these pieces come together as we continue to scale up.