
Retrieval Augmented Generation for Retail: Solving 3D

A room photo became the starting point for personalized furniture recommendations that fit the actual space, style, and lighting conditions of the customer.

We built a retail AI system that could analyze a room from a single photo, infer layout and depth, retrieve relevant furniture from live catalogs, and render replacement pieces back into the scene with believable scale, perspective, and lighting. To make that possible, we combined computer vision, monocular depth estimation, multimodal retrieval, and custom 3D generation under a very aggressive timeline. The result gave Palazzo a practical RAG retail system for AI product discovery, 3D scene understanding AI, and photorealistic, shoppable room transformations.

We treated this as a serious retrieval and spatial-computing problem, combining RAG-style retrieval, computer vision, and 3D scene understanding to make product discovery feel natural, visual, and commercially useful.

Your spark, our fuel. ©2025 Dreamers Inc. The future is vast. What will you build?
Dreamers is a research, data science, and tech consultancy that specializes in building solutions for complex problems. We are based in sunny California, but work remotely with clients from all over the world. We're engineer-owned and operated, meaning you'll get personal attention and quick answers without having to bushwhack through obfuscating layers of management.
Palazzo [Case Study]
The Vision

Imagine walking into a room, snapping a picture, and instantly getting recommendations for better furniture. Not just generic suggestions, but real pieces from actual catalogs, perfectly sized, styled, and placed in your space. The goal of this project was to bring that experience to life, using a blend of cutting-edge visual analysis, smart recommendations, and photorealistic rendering.
The Challenge

Creating a system capable of this level of design intelligence is no trivial task. It involves not just recognizing objects and spaces but truly understanding them: understanding depth, perspective, and scale while maintaining the style and aesthetic harmony of the room. Blending 2D catalog images into 3D environments, generating lifelike meshes, and ensuring realistic placement and lighting require a seamless integration of computer vision, machine learning, and 3D modeling. The complexity of building such a system lies in the delicate balance between technological precision and artistic presentation.

To make this vision a reality, we had to build a system that could:

Visualize the Space: Develop a model that could analyze and configure the room layout.

Leverage Real Catalogs: Create a robust database with data from leading furniture companies.

Make Smart Suggestions: Implement a recommender module that not only matches style and color but also ensures a perfect fit in the room.

Transform 2D to 3D: Generate 3D meshes from 2D images of catalog items.

Understand 3D, Given a 2D Image: Monocular depth perception (understanding depth from a single image) is, strictly speaking, an under-determined problem: a single image does not contain enough information to recover depth uniquely. Humans rely on binocular vision, using the slight differences between two eye perspectives to gauge depth. To overcome this, we used advanced machine learning solutions to generate a reliable depth map from 2D images, enabling the system to understand spatial relationships in the room.

Perfect the Fit: Assess the pose and scale of objects being replaced, then rotate and fit new items seamlessly.

Deliver Realism: Blend and stitch the new objects into the room with a photorealistic touch.
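To make the "Make Smart Suggestions" requirement concrete, here is a minimal sketch of retrieval with a fit constraint. It is not the production recommender: the function name `recommend`, the two-dimensional style embeddings, and the catalog widths are all hypothetical, chosen only to show the idea of filtering by size first and ranking by style similarity second.

```python
import numpy as np

def recommend(query_vec, catalog_vecs, catalog_widths_m, max_width_m, top_k=3):
    """Return indices of up to top_k catalog items that fit, ranked by cosine similarity.

    All inputs are illustrative: query_vec / catalog_vecs stand in for style
    embeddings, catalog_widths_m for product widths in metres.
    """
    catalog_vecs = np.asarray(catalog_vecs, dtype=float)
    widths = np.asarray(catalog_widths_m, dtype=float)
    q = np.asarray(query_vec, dtype=float)

    # Hard constraint first: drop anything wider than the available space.
    fits = np.flatnonzero(widths <= max_width_m)
    if fits.size == 0:
        return []

    # Cosine similarity between the query and each remaining item.
    cand = catalog_vecs[fits]
    sims = cand @ q / (np.linalg.norm(cand, axis=1) * np.linalg.norm(q) + 1e-12)

    # Highest similarity first, mapped back to original catalog indices.
    order = np.argsort(-sims)[:top_k]
    return [int(i) for i in fits[order]]

# Hypothetical catalog: item 0 is the best style match but too wide for the gap.
catalog_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]]
catalog_widths_m = [2.6, 2.0, 1.8, 2.1]
picks = recommend([1.0, 0.0], catalog_vecs, catalog_widths_m, max_width_m=2.2, top_k=2)
print(picks)
```

Filtering before ranking means an item that cannot physically fit is never shown, however well it matches the requested style.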
Obstacles We Faced

Ridiculously Tight Timeline: Solution: Work like maniacs. Stay laser-focused and crank out results.

No Hosting from Palazzo: Solution: We hosted the entire project on our Dreamers server, from development to production, avoiding delays and delivering on time.

Slow and Expensive 3D Mesh Tools: Solution: We built our own. Using state-of-the-art tools, we followed top repositories and connected with the brightest minds. We remained aware of the daily shifts in cutting-edge technology, maintaining close communication with the three top teams at the forefront of 3D visual systems, including groups working on NeRF, and specialized researchers at UCSD, KAUST, and NVIDIA. Our in-house solution was not only free and fast but also outperformed commercial options when the project began.

Evolving Technology Landscape: The pace of change in this field is staggering. The tools available when we started were nearly obsolete by the time we finished. We adapted continuously, evolving our approach as new research and tools emerged. Modern problems require this level of dynamic reaction to a changing field. Anything less risks delivering outdated solutions by the time they're implemented.

Missing Room Dimensions: Originally, room dimensions were supposed to be provided as part of the project scope. When it became clear they would not be, we built a system to estimate room dimensions accurately, keeping us on track despite a dynamic and evolving scope.

Performance Bottlenecks: Solution: The model we needed for scale and orientation estimation was powerful but slow (300 seconds per run). That wasn't going to fly. We brought in our high-performance computing expert and slashed processing time to 10 seconds.
The process [Under the hood]
Why Depth Perception Is Hard, Especially with a Single Image

When an image is captured using a single lens (monocular vision), it lacks the explicit depth information humans perceive using two eyes and stereoscopic vision. So, to build a 3D representation from an image, we must infer spatial relationships from shading, perspective, and object size. Even then, there is inherent ambiguity in some situations, as demonstrated by the gif below. This makes tasks like accurately replacing furniture and reconstructing 3D scenes incredibly challenging.

See for Yourself: view the depth ambiguity demo

To address this problem, we build a depth map of the space. A depth map is a representation where each pixel encodes depth information, typically using grayscale values: the darker the pixel, the closer the object; the lighter, the farther away. Depth maps allow AI systems to estimate spatial relationships, helping to reconstruct a 3D scene from flat images.

Below is a depth map demonstrating how different objects in a scene are positioned relative to the camera.
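The grayscale encoding described above can be sketched in a few lines. This is an illustration only, assuming the depth values are already available as an array in metres (in practice they would come from a depth-estimation model); the function name `depth_to_grayscale` and the sample values are hypothetical.

```python
import numpy as np

def depth_to_grayscale(depth_m):
    """Map a depth array to uint8 grayscale: nearest -> 0 (black), farthest -> 255 (white)."""
    depth = np.asarray(depth_m, dtype=float)
    near, far = depth.min(), depth.max()
    if far == near:
        # Flat scene: every pixel is equally close, so render it uniformly dark.
        return np.zeros(depth.shape, dtype=np.uint8)
    # Normalise to [0, 1], then stretch to the 8-bit range.
    scaled = (depth - near) / (far - near)
    return np.round(scaled * 255).astype(np.uint8)

# Hypothetical 2x2 depth map in metres.
depth = np.array([[1.0, 2.0],
                  [3.0, 5.0]])
gray = depth_to_grayscale(depth)
print(gray)
```

Note that this normalisation is relative to the scene: it preserves which pixels are nearer or farther, but the absolute scale is lost unless the near and far values are stored alongside the image.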
To successfully attack the furniture modification problem, we needed an image classifier capable of knowing exactly what every item in the room is, and which pixels map to that item. This is known as classification and masking. Here is an example of masking, in which the pixels map appropriately to what we want to replace in the room.

In addition to "masking" images, we want to know what each item is, and draw a 3D bounding box around it, so that the computer vision system can work in three dimensions.
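One way to see how a mask and a depth map combine into a 3D box is to back-project the masked pixels through a pinhole camera model and take the extent of the resulting points. The sketch below assumes that model and known intrinsics (`fx`, `fy`, `cx`, `cy`), which are illustrative values rather than anything from the project; it also produces an axis-aligned box in the camera frame, which is simpler than an oriented box fitted to the furniture.

```python
import numpy as np

def mask_to_3d_box(mask, depth_m, fx, fy, cx, cy):
    """Back-project masked pixels to 3D and return (min_xyz, max_xyz) in the camera frame.

    Assumes a pinhole camera: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    mask = np.asarray(mask, dtype=bool)
    depth = np.asarray(depth_m, dtype=float)

    # Row (v) and column (u) indices of every pixel that belongs to the object.
    v, u = np.nonzero(mask)
    if v.size == 0:
        raise ValueError("mask is empty")

    # Depth at each masked pixel, then the pinhole back-projection.
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=1)

    # Axis-aligned extent of the point cloud.
    return points.min(axis=0), points.max(axis=0)

# Hypothetical 4x4 image: a 2x2 object in the centre, 2 m from the camera.
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
depth = np.full((4, 4), 2.0)
box_min, box_max = mask_to_3d_box(mask, depth, fx=100.0, fy=100.0, cx=1.5, cy=1.5)
print(box_min, box_max)
```

The box dimensions are only as good as the depth estimate, which is why a reliable depth map matters before any scale or pose reasoning.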
The System in Action

Example 1

Prompt: “Our kids playroom has a sofa that can really use an upgrade. I like the color, but I want something more modern and clean looking, roughly the same size.”

Supplied Photo:

Sofa selected by database
System Solution: The selected sofa has been sized, aligned, and blended with appropriate lighting to fit into the room.
Example 2

Prompt: “The L shape in my living area takes up a lot of room. Can we replace it with a sofa and a few other pieces to let the space breathe? Pick colors that you feel go well with the paint.”

Supplied Photo:

Sofa selected by database
Looks Like a Couch in a Room - Isn't This Problem Easy?

At first glance, replacing a couch in a room seems trivial. With all our modern technology, why isn't this just a drag-and-drop process? The difficulty lies in the fact that the input image we receive has different angles, both vertically and horizontally, than the reference catalog image. Even defining what the original angle is, on its own, is not an easy problem. This means we can't just place the new furniture in the scene; we need to generate a completely new image that correctly matches the perspective of the room.

Further, lighting in the room rarely matches the lighting from the original product image. Shadows, reflections, and color tones all have to be adjusted dynamically, or else the replacement will feel unnatural. Additionally, if the new sofa is placed at an angle but its size doesn't subtly decrease with distance, the human brain instantly perceives something is very off, even without being able to pinpoint what that is.

The fact that these modifications appear seamless is a significant achievement, not something that works automatically out of the box.
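The point about size decreasing with distance follows directly from perspective projection: under a pinhole camera model, an object's height in pixels is proportional to its real height divided by its distance from the camera. The sketch below illustrates this for a sofa placed at an angle, so that one end is farther away than the other. The focal length, sofa height, and distances are made-up numbers, and `projected_height_px` is a hypothetical helper.

```python
def projected_height_px(height_m, depth_m, focal_px):
    """Pinhole projection: on-screen height in pixels of an object at a given distance."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * height_m / depth_m

# Hypothetical values: 0.8 m tall sofa, camera focal length of 1000 px.
focal_px = 1000.0
sofa_height_m = 0.8

# The sofa sits at an angle: its near end is 2.0 m away, its far end 2.5 m away.
near_px = projected_height_px(sofa_height_m, 2.0, focal_px)
far_px = projected_height_px(sofa_height_m, 2.5, focal_px)
ratio = far_px / near_px

print(f"near end: {near_px:.0f} px, far end: {far_px:.0f} px, ratio: {ratio:.2f}")
```

With these numbers the far end should be drawn at roughly 80% of the near end's height. Pasting a flat catalog image at uniform scale would ignore that falloff, which is the kind of mismatch a viewer notices immediately.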
Results
Despite the hurdles, we delivered a fast, high-quality solution that has set a new benchmark for interior design automation. The project exceeded Palazzo's expectations, combining advanced technology with creative problem-solving to turn an ambitious vision into a functional reality. In turn, their clients, including household names such as Decorator's Best, Ashley, and Arhaus, will now be able to better present their furniture lines to the buying public.

“I loved the founder's vision and technical understanding.”

For this and other rave reviews, you can read more on Clutch:
Conclusion

The way humans interface with computers is evolving rapidly. Search is dead. We're moving beyond rigidly confined queries and into an era where we interact with data systems using our own words, where natural language becomes the primary interface. Retail is transforming, driven by deeply personalized compute resources and intelligent systems that understand context and intent. We're also starting to see hints of augmented reality creeping into our visual spaces, blending the physical and digital worlds seamlessly. Dreamers Inc. is uniquely positioned to accept this challenge and push the boundaries.