Detecting Artificial Images with Vanishing Point Geometry

5mo Edited

Artificial Image Detection using Vanishing Point Geometry 🔍

Artificial images often break the geometric rules that real cameras follow. Even if they look similar at the pixel level, these geometric differences make them easy to separate from natural photos. One example is vanishing points. Real scenes usually have one main vanishing point because the whole image comes from a single camera perspective. But synthetic images often show many vanishing points spread out. This happens because generative models build scenes piece by piece and do not enforce one consistent perspective across the whole image.

98 Comments

Ivo Horvat 5mo

In AI-generated content, I have observed inaccuracies in every aspect of geometric construction. Besides failing to represent single or multiple vanishing point perspective matrices correctly, AI generally does not preserve the geometric accuracy of other matrices within an image, such as lighting structure, scale consistency, lens distortion, or object consistency, and often color consistency across the value spectrum. For example, we often see objects that do not match their cast shadows, reflections unrelated to the reflector, building floors of randomly different sizes, opposing sides of automobiles that are not perfectly symmetrical, or, most often of all, color saturation that is inaccurate across shading values. Some of this is of course harder to detect in purely organic subject matter.

Orlando Orozco 5mo

This addresses one of my complaints with AI-generated content. The other is the lack of scale consistency in the spaces, floor plans, and buildings AI visualizes. It is getting better (especially compared to early examples I experimented with a couple of years ago), but this analysis is helpful in quantifying exactly what is amiss.

Sabine VanderLinden 5mo

this geometric vulnerability you've pointed out really shows how generative models still struggle with fundamental spatial logic. it's fascinating that while these systems can create incredibly detailed textures and lighting, they're still tripping up on basic perspective rules that our eyes naturally expect to see unified across an image.

JAE-HONG E. 5mo

This is a good point to consider when identifying fake content. Among the convincingly generated AI content we see today, you can often spot cases where the single vanishing point perspective is broken. However, just as AI has resolved the awkward finger problem over time, it seems likely that AI will soon begin generating content that properly reflects single vanishing point perspective as well. 😬

Marcel Gutsche 5mo

Some mentioned you could just use this as a constraint in your model, but I'm skeptical that would work. Vanishing point consistency is a global constraint, but diffusion models build iteratively from noise. You can't enforce it meaningfully until you have enough image coherence, but by then you're already committed to local decisions that might violate it. Computing it at every denoising step gets expensive, and the model wasn't designed to encode global geometric consistency anyway.

Nermine M. 5mo

Really interesting! Synthetic images often have trouble keeping a consistent perspective, which can show up as multiple vanishing points. It would be cool to try a machine learning approach to catch this. You could extract the lines, find where they intersect to get vanishing points, and use that information to help a classifier spot generated images.
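The pipeline suggested in this comment (extract lines, intersect them, turn the result into a classifier feature) could be sketched roughly as below. This is only an illustration under stated assumptions: line segments are assumed to come from an upstream detector (for example a Hough transform) as endpoint pairs, the function names `to_line`, `intersect`, and `vp_spread` are hypothetical, and the "spread" statistic is one possible feature, not something proposed in the post.

```python
from itertools import combinations
from statistics import median


def to_line(segment):
    """Homogeneous line (a, b, c) through a segment's two endpoints."""
    (x1, y1), (x2, y2) = segment
    # Cross product of the homogeneous points (x1, y1, 1) and (x2, y2, 1).
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def intersect(l1, l2, eps=1e-9):
    """Intersection of two homogeneous lines, or None if (near-)parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1  # third component of the cross product l1 x l2
    if abs(w) < eps:  # eps is scale dependent; an assumption for this sketch
        return None
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)


def vp_spread(segments):
    """Median distance of pairwise line intersections from their median point.

    A small value means the lines roughly agree on one vanishing point;
    a large value means the candidate vanishing points are scattered.
    Returns None when no pair of lines intersects.
    """
    lines = [to_line(s) for s in segments]
    points = []
    for l1, l2 in combinations(lines, 2):
        p = intersect(l1, l2)
        if p is not None:
            points.append(p)
    if not points:
        return None
    cx = median(x for x, _ in points)
    cy = median(y for _, y in points)
    return median(((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 for x, y in points)


# Three segments whose extensions all meet at (100, 50).
converging = [((0, 0), (50, 25)), ((0, 100), (50, 75)), ((0, 50), (10, 50))]
print(vp_spread(converging))  # 0.0
```

In practice the segments would first have to be grouped by dominant direction, since a real photo of a building legitimately has a separate vanishing point for each family of parallel lines. A statistic like this would be one input feature to a classifier, not a decision rule on its own.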

Ahmet Başaran 5mo

Interesting idea, but it feels a bit like assuming “current AI models behave this way, so they always will.” Vanishing-point inconsistency isn’t some inherent flaw of generative models; it’s just a side effect of not forcing them to follow real camera geometry. If you condition the model on a single viewpoint or enforce projective constraints, producing perfectly consistent one-point-perspective synthetic images is trivial. And detecting VPs reliably isn’t guaranteed either — real photos with wide-angle lenses, distortion, or weak line structure will fail this test just as easily. So yes, this may catch today’s models, but tomorrow they’ll pass it effortlessly. Doesn’t look like a very future-proof detector. ... says chatgpt :)

Evelyn Qin Zhang 5mo

I’m skeptical about this as both an evaluation criterion and a training constraint, because real images and videos often have occlusions that prevent one from accurately estimating those geometric invariants.

