Blog Posts

Week 14: Final System Level Design Review Reflection

This week felt like the “final chapter” of our VOXEL O2A journey, centered on our final System-Level Design Review (SLDR). The afternoon began with a networking session from 1–3 PM, where a liaison sat at our table and spoke candidly about what life looks like in the real world beyond the classroom. Instead of talking about our specific project, he focused on broader themes: how expectations change when you enter industry, the importance of reliability and consistency, and why skills like communication, accountability, and learning quickly on the job matter just as much as technical knowledge. Hearing about challenges such as handling ambiguity, working with diverse teams, and managing deadlines gave us a clearer picture of what it really means to be ready to work, not just technically strong.

From 3–5 PM, we shifted into our final SLDR presentation. As a team, we walked the audience through our full pipeline story, starting from a simple phone capture and moving through segmentation, 3D reconstruction, mesh and texture generation, and finally into Unity, where assets can be viewed, rotated, and inspected. We also had the exciting opportunity to demo our working prototype, showcasing the end-to-end pipeline that transforms standard video footage into 3D meshes using Gaussian Splatting. It was incredibly rewarding to watch the pipeline successfully process the “Gator” video from segmentation through mesh extraction, and to receive validation that our progress is well ahead of the curve for this stage of the project.

Beyond the technical success, the highlight of the day was undoubtedly the human connection. We finally had the chance to meet our liaisons in person after weeks of remote collaboration, which added a great dynamic to our discussions about future use cases and optimizations. The event was a fantastic experience overall: it was fun networking with new people, sharing our work with a broader audience, and celebrating the solid foundation we’ve built as we head into the next phase of development.


Week 13: Peer-Review System-Level Design Review (SLDR) Feedback and Reflection

This week was all about our peer-review System-Level Design Review (SLDR) for the VOXEL O2A pipeline. We presented alongside four other IPPD teams, our coaches, and the TA, so it felt more like a mini conference than a regular class review. As a team, we walked everyone through how a simple phone capture travels through our pipeline, from object isolation and 3D reconstruction all the way to Unity-ready assets. People responded well to the way we framed O2A as a complete, mobile-first workflow rather than a single demo, and several reviewers specifically highlighted our clear module breakdown, error-handling strategy, and focus on turning research ideas into something a real user could actually touch and use.

The overall tone of the feedback was very encouraging. Reviewers felt we had a solid foundation and a realistic plan for where the system is headed next. The main suggestions were to simplify our architecture slide so that the core data flow pops out more clearly at a glance, and to spell out our testing plan in a more structured way: what datasets we’ll use, which metrics (like PSNR, SSIM, and runtime) we’ll report, and how we’ll compare against baselines such as NeRF and COLMAP. Both of these feel more like polishing than major rework, which boosted our confidence that we’re on the right track.
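To make that testing plan concrete, here is a minimal sketch of the per-view metric reporting we have in mind, assuming scikit-image is available; the function name and data layout are our own illustration, not a finalized harness:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_views(rendered, ground_truth):
    """Average PSNR/SSIM over paired lists of (H, W, 3) uint8 views."""
    psnrs, ssims = [], []
    for pred, gt in zip(rendered, ground_truth):
        psnrs.append(peak_signal_noise_ratio(gt, pred, data_range=255))
        ssims.append(structural_similarity(gt, pred, channel_axis=-1, data_range=255))
    return float(np.mean(psnrs)), float(np.mean(ssims))
```

Runtime would be logged separately per pipeline stage, so the same table can compare our results against the NeRF and COLMAP baselines.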

Looking ahead, we want to convert this momentum into concrete upgrades. We plan to redesign the architecture slide with cleaner visuals, then expand our SLDR document to clearly describe our evaluation pipeline, baselines, and Spring 2026 testing roadmap. In parallel, we’ll keep strengthening the technical side: stabilizing our mesh and texture pipeline, experimenting with Poisson/Ball Pivoting, running 3D Gaussian Splatting on benchmark datasets, and tightening our Unity-based visualization so people can freely rotate and inspect our reconstructed assets. Overall, this SLDR made VOXEL O2A feel less like a one-time prototype and more like a product-ready pipeline in progress, and that’s a direction we’re excited to keep building toward.


Week 12: Initial Testing on Mesh Generation

This week, our main focus was on mesh generation and understanding how to turn point-based or radiance-field outputs into clean, usable 3D assets. We experimented with multiple mesh extraction methods and spent time tuning parameters like density, smoothing, and decimation to see how they affect both visual quality and downstream usability. Instead of treating mesh generation as a “black box” step at the end of the pipeline, we dug into how surface reconstruction actually works and what kinds of inputs (noise level, point distribution, normal estimates) lead to stable outputs. This deeper understanding is already helping us recognize why some scenes reconstruct well while others collapse into artifacts or holes.

On the integration side, we started wiring these mesh outputs back into our VOXEL workflow so that meshes are not just generated in isolation but produced as a repeatable stage in the pipeline. We compared meshes across different methods for the same scene, checked how well they preserve fine details, and considered how they will behave once textured and visualized in tools like Unity. As we refine this stage, we can clearly see the full pipeline coming together: capture → segmentation → reconstruction → mesh → (soon) texturing and interactive viewing. At this point, we feel we are only two key steps away from an initial end-to-end pipeline by the end of the semester, and this week’s mesh work was a critical bridge between our earlier 3DGS experiments and a product-ready asset workflow.
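As a concrete illustration of the kind of experiment described above, here is a minimal Open3D sketch of Poisson reconstruction followed by density-based trimming and decimation; the file names and parameter values are placeholders we tune per scene:

```python
import numpy as np
import open3d as o3d

# Illustrative input: a point cloud exported from our reconstruction stage.
pcd = o3d.io.read_point_cloud("scene_points.ply")
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

# Poisson surface reconstruction; higher depth = more detail but more noise.
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)

# Trim low-density (poorly supported) vertices, then decimate for Unity.
densities = np.asarray(densities)
mesh.remove_vertices_by_mask(densities < np.quantile(densities, 0.05))
mesh = mesh.simplify_quadric_decimation(target_number_of_triangles=100_000)
o3d.io.write_triangle_mesh("scene_mesh.ply", mesh)
```

The density trim is what keeps Poisson from hallucinating closed surfaces over sparse regions, which is exactly the artifact-versus-hole trade-off we were probing this week.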


Week 11: Prototype Inspection Day Reflections and Feedback


This week was all about Prototype Inspection Day (PID). Our team presented the VOXEL O2A pipeline to six judges and walked them through how we turn phone-captured images into 3D assets. Two of us presented online while the rest of the team was in person, so it ended up being a mix of virtual and live interaction, but everything ran smoothly. The overall response was very positive. A few judges even mentioned that the project felt 99% there, which made us feel confident that our idea and pipeline are on the right track.

At the same time, the feedback clearly showed where we can improve. The judges wanted to interact with our results instead of only seeing static screenshots. They asked to rotate the 3D models, view them from different angles, and understand what the final user experience would look like. They also suggested that we show clearer comparisons against baseline methods like NeRF and COLMAP, include metrics such as PSNR, SSIM and runtime, and add more context about our segmentation and evaluation setup. We also realized we need to balance our presentation style. Some people felt parts were too technical, while others wanted more detail, so we plan to layer our explanations for different audiences.

Looking ahead to next week, we want to turn this feedback into concrete progress. On the technical side, we will keep moving toward a stable end-to-end mesh and texture pipeline, testing mesh extraction methods like Poisson and Ball Pivoting in MeshLab to see which ones give cleaner assets. We will also start running 3D Gaussian Splatting on the NeRF synthetic dataset and explore Unity-based visualization so people can view and rotate our reconstructed meshes directly and compare quality and runtime. In parallel, we will begin drafting the SLDR and make sure it clearly explains our testing strategy, roadmap, and baselines so VOXEL O2A feels less like a demo and more like a product-ready workflow.
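Alongside MeshLab, the Ball Pivoting variant can also be scripted with Open3D; this is a rough sketch with illustrative paths, and the radii heuristic is just a starting point:

```python
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("gator_points.ply")  # illustrative path
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

# Ball Pivoting needs a set of ball radii; tie them to the average point spacing.
avg_dist = np.mean(np.asarray(pcd.compute_nearest_neighbor_distance()))
radii = o3d.utility.DoubleVector([1.5 * avg_dist, 3.0 * avg_dist])

mesh = o3d.geometry.TriangleMesh.create_from_point_cloud_ball_pivoting(pcd, radii)
o3d.io.write_triangle_mesh("gator_bpa_mesh.ply", mesh)
```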

Week 10: Poster Presentation and Pipeline Evaluation

This week, our team successfully finalized and presented our poster at the UF AI Days Showcase, where we summarized our comparative study of NeRF variants. During the presentation, we highlighted the strengths and limitations of each model. For use cases requiring faster reconstruction, we recommended alternative NeRF variants optimized for speed while maintaining reasonable quality.

We are now successfully concluding Phase 1 of our project, which encompasses the end-to-end workflow from scene segmentation to NeRF-based 3D reconstruction. Testing and evaluation are actively in progress, and the initial results are highly promising. Our current system demonstrates strong structural fidelity and visual accuracy, consistently generating precise 3D reconstructions.

We have observed a notable improvement in both processing speed and model compactness compared to previously established methods. Specifically, the time required for both preparatory steps and core training has been significantly reduced. These outcomes strongly indicate that our core development, the adaptive densification and pruning strategy, is effective: it strikes a fine balance between computational efficiency and accuracy while keeping the resulting model well optimized.
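For readers unfamiliar with the idea, here is a deliberately simplified, NumPy-only sketch of what adaptive densification and pruning does conceptually; the thresholds echo common 3D Gaussian Splatting defaults, and this is an illustration rather than our production code:

```python
import numpy as np

def densify_and_prune(pos, scale, opacity, grad,
                      grad_thresh=2e-4, min_opacity=0.005, split_scale=0.01):
    """One conceptual densify/prune step over N Gaussians (pos, scale: Nx3)."""
    keep = opacity > min_opacity                      # prune near-transparent Gaussians
    pos, scale, opacity, grad = pos[keep], scale[keep], opacity[keep], grad[keep]

    hot = grad > grad_thresh                          # high view-space gradient
    clone = hot & (scale.max(axis=1) <= split_scale)  # small Gaussian: clone it
    split = hot & (scale.max(axis=1) > split_scale)   # large Gaussian: split in two

    survivors = ~split                                # split originals get replaced
    pos = np.concatenate([pos[survivors], pos[clone], pos[split], pos[split]])
    scale = np.concatenate([scale[survivors], scale[clone],
                            scale[split] / 1.6, scale[split] / 1.6])
    opacity = np.concatenate([opacity[survivors], opacity[clone],
                              opacity[split], opacity[split]])
    # Real implementations also resample split positions from the parent Gaussian
    # and accumulate gradients over many iterations; this sketch omits all of that.
    return pos, scale, opacity
```

The key intuition is that pruning keeps the model compact while cloning and splitting add capacity only where the rendering error says it is needed, which is where the speed and compactness gains come from.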

Overall, this week marks a major milestone as we transition from model development to full system integration and evaluation. In the next phase, we plan to extend testing to segmented, mobile-captured objects and explore alternative reconstruction techniques such as DUSt3R, aiming to further enhance both reconstruction quality and runtime performance.


Week 9: How Manual Input Strategies Impact 3D Object Segmentation Performance

In this study, we explored how different manual input strategies influence the performance of modern object segmentation models on a diverse 3D dataset of everyday household items. The focus was on understanding how varying levels of human guidance, ranging from a single click at the object’s center to multiple points spread across its outline, can affect the model’s ability to accurately identify object boundaries and maintain efficiency during segmentation. The evaluation used widely recognized measures of segmentation quality that assess the overlap between predicted and reference masks, offering a balanced understanding of accuracy and consistency without relying on raw numerical results.

The findings showed that denser and more deliberate manual inputs generally led to more precise segmentations, while minimal inputs provided faster but less detailed outcomes. Models that incorporated broader contextual cues from multiple points tended to capture object boundaries more faithfully, particularly for complex shapes. Conversely, simpler inputs were sufficient for well-defined or isolated objects, highlighting an interesting trade-off between effort and precision. These insights suggest that the way users interact with segmentation systems can meaningfully shape their effectiveness, encouraging future research into adaptive methods that balance speed, precision, and user experience in large-scale 3D understanding.
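As an illustration of what these input strategies look like in practice with a promptable model like SAM (which our pipeline already uses), here is a sketch of the two extremes of the input spectrum; the image path, checkpoint name, and click coordinates are placeholders:

```python
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

image = cv2.cvtColor(cv2.imread("mug_rgb.png"), cv2.COLOR_BGR2RGB)
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)

def segment(points):
    """points: list of (x, y) clicks, all labeled as foreground (label 1)."""
    masks, scores, _ = predictor.predict(
        point_coords=np.asarray(points, dtype=np.float32),
        point_labels=np.ones(len(points), dtype=np.int64),
        multimask_output=True)
    return masks[np.argmax(scores)]        # keep the highest-scoring candidate

center_mask = segment([(320, 240)])                           # one center click
outline_mask = segment([(250, 180), (390, 180), (320, 330)])  # spread-out clicks
```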


Key Technical Terms Explained

Hybrid Segmentation Pipeline: A system that combines different segmentation strategies or models to balance speed, precision, and adaptability for various applications.

3D Dataset: A structured collection of visual and spatial data where each object is represented in three dimensions, often including RGB images, depth information, and object masks.

Segmentation Model: A type of deep learning model designed to separate objects or regions within an image, identifying precise boundaries between them.

Manual Input Strategy: The method by which a user provides hints to guide a model’s segmentation process, such as clicking, drawing boxes, or marking points around an object.

Overlap-Based Metrics: Evaluation methods that compare predicted masks to ground-truth masks to measure how closely they match; examples include Intersection over Union and the Dice Coefficient, both sketched in code just after this list.

Inference: The stage where a trained model makes predictions on new data. Faster inference indicates a more efficient model, whereas slower inference often correlates with more detailed processing.
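To make the overlap-based metrics above concrete, here is a minimal NumPy sketch of both measures on boolean masks:

```python
import numpy as np

def iou(pred, gt):
    """Intersection over Union between two boolean masks."""
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0

def dice(pred, gt):
    """Dice coefficient: 2 * |A & B| / (|A| + |B|)."""
    total = pred.sum() + gt.sum()
    return 2 * np.logical_and(pred, gt).sum() / total if total else 1.0
```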


Week 8: From PDR Success to Segmentation Breakthroughs

This has been an incredibly productive and rewarding week for Team Voxel! A major highlight was the successful completion of our Preliminary Design Review (PDR) presentation to the AGIS AI Team. We’re thrilled to report that the presentation went smoothly, and we received very positive feedback from the team. This interaction was invaluable, giving us fresh perspectives and solid validation for our approach moving forward. Beyond the PDR, we also took significant steps toward critical testing, specifically by assembling a dataset for evaluating the manual-input approach to providing object information in our Segmentation Module.

On the technical front, a huge amount of effort went into advancing our segmentation and 3D reconstruction capabilities this week. Our team tested DUSt3R and COLMAP on segmented objects and identified the drawbacks of each, which is key to refining our methodology. We explored a range of segmentation techniques, including zero-shot, Vision AI, and foreground/background segmentation using both SAM and MobileSAM. For our work with COLMAP, we integrated the YOLOv8s model to segment reflective objects and pass them into the pipeline to test mesh generation. These deep dives into different models and techniques are crucial for ensuring the robustness and accuracy of our final solution.
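To give a flavor of the reflective-object path, here is a rough sketch of chaining a YOLOv8s detection into a SAM box prompt before frames reach COLMAP; paths and checkpoint names are illustrative, and the real module handles multiple detections and failure cases:

```python
import cv2
import numpy as np
from ultralytics import YOLO
from segment_anything import sam_model_registry, SamPredictor

image = cv2.cvtColor(cv2.imread("frame_0001.jpg"), cv2.COLOR_BGR2RGB)

# 1. Detect candidate objects with YOLOv8s.
boxes = YOLO("yolov8s.pt")(image)[0].boxes.xyxy.cpu().numpy()

# 2. Feed the top box to SAM as a prompt to get a tight mask.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)
masks, _, _ = predictor.predict(box=boxes[0], multimask_output=False)

# 3. Zero out the background before the frame enters the COLMAP stage.
masked = (image * masks[0][..., None]).astype(np.uint8)
cv2.imwrite("frame_0001_masked.png", cv2.cvtColor(masked, cv2.COLOR_RGB2BGR))
```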


Week 7: Peer Review & Technical Progress

This week, our team participated in the Peer Design Review (PDR) session, where we presented our complete O2A pipeline to other IPPD teams. The feedback was very positive: many appreciated the technical depth of our project and the end-to-end structure of the pipeline. The discussion also gave us valuable insights and suggestions to refine our presentation for the official PDR next week.

On the technical side, we made exciting progress exploring new models for Structure-from-Motion (SfM). We tested the DUSt3R model and successfully visualized 3D point clouds, which was an incredible milestone: seeing our captured objects reconstructed in 3D gave a real sense of the pipeline coming together.
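For anyone curious, once a reconstruction is exported, inspecting it takes only a couple of lines; this assumes the DUSt3R output has been saved as a .ply point cloud, and the path is illustrative:

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("dust3r_scene.ply")
print(pcd)                                   # prints the point count
o3d.visualization.draw_geometries([pcd])     # opens an interactive 3D viewer
```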

We also began experimenting with a new method for object isolation using depth extraction techniques to separate the main object from the background. This step will be crucial for improving reconstruction quality and making the pipeline more robust to real-world scenes.
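A minimal version of that idea, assuming the target object is the closest region to the camera (true for our tabletop-style captures), is a simple depth threshold; the margin below is a placeholder we are still tuning:

```python
import numpy as np

def isolate_foreground(image, depth, margin=0.15):
    """Mask pixels whose depth is close to the nearest surface.

    image: (H, W, 3) uint8 RGB; depth: (H, W) float depth map.
    Assumes the main object is the nearest region in the frame."""
    near = np.percentile(depth, 5)            # robust estimate of the closest depth
    mask = depth < near * (1.0 + margin)
    return (image * mask[..., None]).astype(np.uint8), mask
```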

As we continue testing, we’re gaining a clearer understanding of the full pipeline flow, identifying edge cases, challenges, and opportunities for improvement. Each iteration is helping us fine-tune the process and move closer to a seamless, mobile-to-3D asset creation experience.


Week 6: First Segmentation Trials and COLMAP Implementation

This week, Team VOXEL advanced two core parts of the O2A pipeline.

On segmentation, the team tested zero-shot approaches (MobileSAM and SAM2) on single-object sequences. Early findings showed that presenting multiple candidate masks and letting the user confirm with a single click improves reliability.

On reconstruction, the team ran an initial COLMAP → 3D Gaussian Splatting workflow from video frames. Results highlighted the need to apply foreground masks early, especially for challenging cases like transparent objects.
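For reference, the COLMAP half of that workflow is the standard sparse-reconstruction sequence; the sketch below wraps the stock CLI commands in Python (paths are illustrative), and the resulting camera poses and sparse points are what initialize the 3D Gaussian Splatting trainer:

```python
import subprocess

def run(cmd):
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Standard COLMAP sparse reconstruction over extracted video frames.
run(["colmap", "feature_extractor",
     "--database_path", "work/db.db", "--image_path", "work/frames"])
run(["colmap", "exhaustive_matcher", "--database_path", "work/db.db"])
run(["colmap", "mapper",
     "--database_path", "work/db.db",
     "--image_path", "work/frames",
     "--output_path", "work/sparse"])
```

Applying the foreground masks to the frames before feature extraction is the step this week’s results showed we need, since unmasked backgrounds and transparency confuse the matcher.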

The high-level architecture diagram was updated to reflect the current flow from capture and preprocessing through segmentation, mapping, 3D construction, texturing, and Unity export.

Next, the team will continue comparing SAM variants, test 3D reconstruction under varied object/background conditions, and explore alternative 3D paths such as Instant-NGP. With the PDR just around the corner, the upcoming weeks promise to bring sharper results, clearer comparisons, and the first polished demos of the O2A pipeline.


Week 5: Finalizing Our Pipeline

This week marked an important milestone for Team VOXEL as we moved from planning into structured execution of the O2A (Object to Asset) project.

Our biggest accomplishment was finalizing the pipeline structure that will drive our mobile-to-3D asset creation framework. The pipeline includes segmentation, depth estimation, NeRF/iLRM-based 3D reconstruction, mesh cleanup and optimization, Unity integration, and final 3D asset export. By dividing these technologies among team members, we set ourselves up to progress in parallel, ensuring each part of the pipeline receives focused attention.

We also submitted our Preliminary Design Report (PDR), which documents the motivation, scope, customer needs, technical performance measures, and concept generation process for the project. The PDR captures our vision of democratizing 3D asset creation through smartphones, reducing the process from hours to minutes, and enabling user-generated content across VR/MR, gaming, sales, and education.

On the technical front, we began preparing small-scale tests for segmentation and depth estimation using tools like Mask2Former, SAM2, and ZoeDepth. At the same time, we refined our dataset assumptions and shared them with our liaison for feedback.
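To give a sense of the scale of these early tests, running ZoeDepth on a single image is only a few lines via torch.hub; the entrypoint name comes from the isl-org/ZoeDepth repository, and the image path is a placeholder:

```python
import torch
from PIL import Image

# Load ZoeDepth via torch.hub (entrypoint per the isl-org/ZoeDepth README).
device = "cuda" if torch.cuda.is_available() else "cpu"
zoe = torch.hub.load("isl-org/ZoeDepth", "ZoeD_N", pretrained=True).to(device)

img = Image.open("capture_0001.jpg").convert("RGB")
depth = zoe.infer_pil(img)                 # (H, W) NumPy array of metric depth
print("depth range:", depth.min(), "-", depth.max())
```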

Looking ahead, our focus will shift toward implementing an initial proof-of-concept pipeline, expanding Unity testing with prototype assets, and preparing draft slides for the upcoming Preliminary Design Presentation. With the pipeline in place and the PDR submitted, our project remains on track, and we’re excited to see the first tangible results emerge in the weeks ahead.