Today I just want to give a simple example using the two pictures below of the same Saturn V rocket in Huntsville, Alabama.
The difference between the two pictures is on the left side. In Figure B, I kept some banners and building structures. The pictures are 2D, so there is no direct way for us to sense depth just by looking at them. But the extra objects in Figure B give us a feeling of depth based on our experience: we have all seen banners on poles along a long road, or windows along a long building. Because we know those objects are roughly the same size and evenly spaced, seeing them shrink in the picture tells us they are farther away, and so we get a better feel for the length of the Saturn V.
If you mask the tandem lights on the left of the rocket, you will find it harder to sense the length of the rocket.
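To make the banner-and-pole intuition a bit more concrete, here is a minimal sketch using the standard pinhole camera model, where an object's height in the image is proportional to its real height divided by its distance. The focal length, pole height, and distances are made-up numbers for illustration only, and the function names are hypothetical; nothing here was measured from the Saturn V pictures.

```python
# Minimal pinhole-camera sketch: same-sized objects look smaller when farther away.
# All numbers are assumptions for illustration, not measurements from the photos.

FOCAL_LENGTH_PX = 1000.0  # assumed focal length, in pixels
POLE_HEIGHT_M = 5.0       # assumed real height of every pole, in meters


def projected_height_px(distance_m, real_height_m=POLE_HEIGHT_M,
                        focal_px=FOCAL_LENGTH_PX):
    """Image height of an object under the pinhole model: f * H / Z."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_px * real_height_m / distance_m


def distance_from_height_px(height_px, real_height_m=POLE_HEIGHT_M,
                            focal_px=FOCAL_LENGTH_PX):
    """Invert the projection: if we know the real size, image size gives depth."""
    if height_px <= 0:
        raise ValueError("image height must be positive")
    return focal_px * real_height_m / height_px


# Evenly spaced poles along a road (assumed distances from the camera).
pole_distances_m = [10.0, 20.0, 30.0, 40.0]
pole_heights_px = [projected_height_px(d) for d in pole_distances_m]

if __name__ == "__main__":
    for d, h in zip(pole_distances_m, pole_heights_px):
        print(f"pole at {d:5.1f} m appears {h:6.1f} px tall")
```

The point of the sketch is only that a known real size plus an observed image size is enough to recover distance. That is roughly what our experience does for us with the banners, and it is why removing them makes the rocket's length harder to judge.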
I have seen some research that uses single images (instead of the stereo cameras taught in classical textbooks) for depth perception, like this one: http://robotics.stanford.edu/~ang/papers/aaai08-Make3dDepthPerceptionSingleImage.pdf