So in this (last) episode of our quest to visualize sound we have to do a sound-vis experiment, right? Well, I did the experiment with an ultrasonic manipulator I was working on a couple of months ago. I built a Z-type shadowgraph (turns out I didn’t have enough space on the optical table for a knife edge because the camera I used is so enormous) with the following characteristics (see Part II for nomenclature):
For this first experiment I used a phased array device with 64 ultrasonic transducers:
When I give all transducers the same phase, they are supposed to produce a plane-like wave. This is what the wave looks like in this shadowgraph setup:
Here the oscillation levels are plus and minus 50 levels from the background level. Thus, with a signal of 50 levels, the expected pressure fluctuation at 40kHz is 257 Pa, corresponding to an SPL of 142.2dB. Quite loud!
A measurement with a very good microphone (B&K 4939) gave me a pressure fluctuation of 230 Pa – which is quite close to the expected value from the shadowgraph!
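As a quick sanity check on these numbers, here is a minimal sketch of the pressure-to-SPL conversion. It assumes the usual 20 µPa reference for air and plugs the quoted pressures in directly (no RMS conversion), which is the convention that reproduces the 142.2dB figure; the function name is mine.

```python
import math

P_REF = 20e-6  # Pa, standard SPL reference pressure in air (assumed)

def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB re 20 uPa for a given pressure fluctuation."""
    return 20.0 * math.log10(pressure_pa / P_REF)

spl_shadowgraph = spl_db(257.0)  # value estimated from the shadowgraph image
spl_microphone = spl_db(230.0)   # value measured with the microphone
difference_db = spl_shadowgraph - spl_microphone

print(f"shadowgraph: {spl_shadowgraph:.1f} dB, microphone: {spl_microphone:.1f} dB")
print(f"difference: {difference_db:.2f} dB")
```

In decibel terms the two values are about 1 dB apart.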
I made a little video explaining the details of the setup and there’s some really nice footage of acoustic manipulators I built in it. Perhaps you can find some inspiration in it:
So this post was rather short. Most of the content is in the video, so I hope you have a look!
In Parts I and II, we established that we can use a Schlieren or a shadowgraph apparatus to visualize sound waves. The shadowgraph is not as interesting an instrument, due to its stronger high-pass behavior. Nevertheless, both instruments are viable as long as one makes them sensitive enough to see the waves.
When reading the previous posts, the keen reader might be wary of my isothermal assumption when applying the ideal gas law. Shouldn’t temperature also be a spatial wave? The consequence, obviously, would be that the waves become sharper with increasing loudness and easier to visualize as the non-linearity kicks in. The reader would be right: the non-linearity effect can be significant in some cases. But for reasonable sound visualization, the pressure fluctuation amplitude has to come within about 1 order of magnitude of the ambient pressure for this effect to change the wave enough. Let’s look into the math:
For a compressible flow, assuming isentropic sound waves, we have:
Where $\gamma$ is the specific heat ratio for the gas under analysis. A sinusoidal pressure wave then becomes a density wave of the form:
Taking the derivative:
Staring at this expression for long enough reveals we can separate the non-linearity term from the remaining terms:
The non-linear function represents the wave steepening as the amplitude becomes sufficiently high. Looking at the phasor in the denominator, we see the derivative operator split the time fluctuations from the spatial component of the fluctuation. For our detectability purposes, what really matters is the minimum and maximum value of the density derivative, to get a sense of a peak-to-peak amplitude. In this case, the phasor minimum amplitude happens at:
Similarly, the maximum happens when the phasor in the denominator is at its minimum:
Note this value can even be zero, at about 194dB SPL, in which case these equations blow up to infinity. Nevertheless, the difference gives us the peak-to-peak amplitude of the non-linear derivative of density:
Finding the minimum SPL as a function of wavelength is not as simple, since the non-linear function makes it difficult to isolate the pressure amplitude. However, we can do the opposite – isolate the frequency and plot the points with the axes flipped. To do so, let’s isolate the wavenumber:
The factor of 2 comes from the fact that the calculations performed in the previous section considered the amplitude of the sine wave, not its peak-to-peak value. Now we can replace the equations for the Schlieren sensitivity to obtain:
For the shadowgraph, a second derivative needs to be taken. But as shown in the chart below, the non-linearity only becomes relevant above about 180dB. If you do get to these SPL values, you probably have a lot more to worry about on the compressibility side of things. But it comes as good news that we can use a simple relation (like the one derived in Part I) to draw insight about sound vis.
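To get a feel for when the isentropic correction matters, here is a minimal sketch. It assumes sea-level ambient pressure, $\gamma = 1.4$ and a 20 µPa reference, and uses the fact that for $\rho \propto p^{1/\gamma}$ the density gradient is the linearized one multiplied by $(p/p_0)^{-(\gamma-1)/\gamma}$. The factor is evaluated at the pressure crest and trough; the function names are mine.

```python
import math

P_REF = 20e-6         # Pa, SPL reference (assumed)
P_AMBIENT = 101325.0  # Pa, sea-level ambient pressure (assumed)
GAMMA = 1.4           # specific heat ratio of air

def amplitude_from_spl(spl_db: float) -> float:
    """Pressure amplitude in Pa for a given SPL, using the amplitude directly."""
    return P_REF * 10.0 ** (spl_db / 20.0)

def steepening_factors(spl_db: float) -> tuple[float, float]:
    """Ratio of the isentropic density gradient to the linearized one,
    evaluated at the pressure crest (first value) and trough (second value)."""
    eps = amplitude_from_spl(spl_db) / P_AMBIENT
    if eps >= 1.0:
        raise ValueError("amplitude reaches ambient pressure; the factor blows up")
    exponent = (GAMMA - 1.0) / GAMMA
    return (1.0 + eps) ** (-exponent), (1.0 - eps) ** (-exponent)

# SPL at which the pressure amplitude equals the ambient pressure
spl_blowup = 20.0 * math.log10(P_AMBIENT / P_REF)

for spl in (140.0, 160.0, 180.0):
    crest, trough = steepening_factors(spl)
    print(f"{spl:.0f} dB: crest factor {crest:.4f}, trough factor {trough:.4f}")
print(f"blow-up at about {spl_blowup:.1f} dB")
```

With these assumptions the correction stays well below a percent at 140dB and only reaches a few percent around 180dB, in line with the statement above.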
This continues the saga we started in Part I (spoiler alert: you’re probably better off with a Schlieren). Thanks to a good discussion with my graduate school friend Serdar Seckin, I got curious about applying the same sensitivity criterion to a shadowgraph system. Turns out, Settles’ book also has the equations for contrast in the case of a shadowgraph (equation 6.10). It is even simpler, since the contrast is a purely geometrical relation:
The only geometric parameter is the distance between the “screen” (or camera focal plane, in case of a focused shadowgraph) and the Schlieren object. Since the ray deflection occurs in the two directions perpendicular to the optical axis, we get one contrast contribution in each of these directions. The actual contrast seen in the image ends up being the sum of the two:
Since the ray deflection is dependent on the Schlieren object (as discussed in Part I), we can plug its definition to get:
Which is the “Laplacian” form of the Shadowgraph sensitivity. Laplacian between quotes, since it does not include the derivative along the optical axis. We can plug in the Gladstone-Dale relation and the ideal gas law to isolate the pressure Laplacian:
Replacing the traveling wave form, now obviously considering the two-dimensional wavenumber in the directions perpendicular to the optical axis:
Here the wavenumber enters as the squared magnitude of the wavenumber vector, projected onto the plane perpendicular to the optical axis. Applying the same minimum-contrast criterion (a minimum number of camera levels) as in Part I, we get the following relation for the minimum detectable pressure wave:
Which we can easily convert to a minimum SPL based on the wave frequency:
The boxed formula for shadowgraph is very similar to the one for Schlieren found in Part I. By far, the most important point is the dependence on $1/F^2$ [Shadowgraph] instead of $1/F$ [Schlieren]. This is expected, since Schlieren detects first derivatives of pressure, whereas Shadowgraph detects second derivatives, popping out an extra factor of the wavenumber.
From a detector apparatus perspective, however, this is not one of the most desirable features: The shadowgraph acts as a second-order high-pass spatial filter, highlighting high frequencies and killing low frequencies. The Schlieren has a weaker, first-order high-pass filtering behavior. But this means they have a crossing point, a frequency where the sensitivity is the same for both. The apparatus parameters are different, but we can do the exercise for the rather realistic apparatus given in Part I. If we set the screen distance to 0.5 m (i.e., the focal plane is 0.5 meters from the Schlieren object), we get the following curve for a 12-bit camera:
If the news wasn’t all that great for the Schlieren apparatus, the Shadowgraph is actually worse: The crossing point for this system occurs at kHz. There might be applications where these ultrasound frequencies are actually of interest, but I believe in most fluid dynamics problems the Schlieren apparatus is the go-to tool. Remember that, even though I show SPL values above 170dB in this chart, the calculations considered a constant-temperature ideal gas, which is not the case anymore at those non-linear sound pressure levels.
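The comparison can be sketched numerically. This is a minimal sketch, not the code behind the chart: it assumes a contrast proportional to the first pressure derivative for the Schlieren (scaled by focal length over knife-edge cutoff) and to the pressure Laplacian for the shadowgraph (scaled by the screen distance), with the background at half the camera range. The object length, focal length, cutoff height and gas constants are illustrative values I picked; only the 0.5 m distance and the 12 bits come from the text.

```python
import math

# Gas properties for air near 25 C (assumed values)
GLADSTONE_DALE = 2.3e-4  # m^3/kg
R_AIR = 287.05           # J/(kg K)
TEMPERATURE = 298.15     # K
N0 = 1.0                 # ambient refractive index, approximately 1
SOUND_SPEED = math.sqrt(1.4 * R_AIR * TEMPERATURE)

# Hypothetical apparatus parameters
LEVELS = 5             # minimum detectable signal in camera levels
BITS = 12              # camera bit depth
OBJECT_LENGTH = 0.1    # m, extent of the sound field along the optical axis
FOCAL_LENGTH = 2.0     # m, second mirror/lens (Schlieren only)
CUTOFF = 1e-3          # m, unobstructed source image height at the knife edge
SCREEN_DISTANCE = 0.5  # m, object to screen (shadowgraph only)

BACKGROUND = 2 ** (BITS - 1)  # background level at half the dynamic range
COMMON = LEVELS * N0 * R_AIR * TEMPERATURE / (BACKGROUND * OBJECT_LENGTH * GLADSTONE_DALE)

def wavenumber(freq_hz: float) -> float:
    return 2.0 * math.pi * freq_hz / SOUND_SPEED

def min_pressure_schlieren(freq_hz: float) -> float:
    """Minimum detectable pressure amplitude (Pa), first-derivative sensitivity."""
    return COMMON * CUTOFF / (FOCAL_LENGTH * wavenumber(freq_hz))

def min_pressure_shadowgraph(freq_hz: float) -> float:
    """Minimum detectable pressure amplitude (Pa), second-derivative sensitivity."""
    return COMMON / (SCREEN_DISTANCE * wavenumber(freq_hz) ** 2)

# Frequency where both instruments are equally sensitive
crossing_hz = SOUND_SPEED * FOCAL_LENGTH / (2.0 * math.pi * SCREEN_DISTANCE * CUTOFF)
print(f"crossing frequency for these assumed parameters: {crossing_hz / 1e3:.0f} kHz")
```

Below the crossing frequency the shadowgraph needs the louder wave; above it, the roles swap. The actual crossing point depends entirely on the apparatus parameters, so the number printed here is not the one from the chart.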
With this, I’m declaring Schlieren a winner for sound-vis.
Tell me: Would you be excited to see sound? Well, I sure damn was when I first attempted to see the sound produced by an ultrasonic transducer! And still am, to be honest! So let’s learn how to do it with a Schlieren apparatus and the sensitivity considerations necessary in the design of a Schlieren optical system. For this, I’ll use my handy copy of Schlieren and Shadowgraph Techniques by none other than the great Prof. Gary Settles. My PhD advisor, Prof. Alvi, was his PhD student – so I should get this right!
I’m a practical engineer, but I’ll derive the formulas in this section. If you want, you can skip ahead to the next section where I apply them and produce nice design charts.
If you would like more details on the physics, please refer to Settles. This is more of a derivation section. We’ll start with the Gladstone-Dale law, which relates the index of refraction of a gas to its density:
Where $n$ is the index of refraction, $\rho$ is the density of the gas, and the proportionality factor is the wavelength-dependent Gladstone-Dale constant (about 0.23 cm³/g for air in visible light).
Assuming a parallel light beam, the beam deflection of a uniform Schlieren object of length is given by:
Where is the direction of deflection. This deflection is magnified in the focal point of the optical apparatus mirror/lens close to the camera/screen; where the knife-edge is placed, according to:
Where is the change in the focal point location in the direction perpendicular to the knife edge and is the focal length of the second optical element (lens/mirror). Then, the contrast change observed in the camera/screen is given by Equation 3.6. in Settles’ book:
We can use the ideal gas law to combine the expressions as a function of pressure:
We can then assume we are looking for sinusoidal sound waves of the form :
OK. Now let’s stop here for a while. I think it goes without saying you’ll probably need a high speed camera to see the sound waves, as they aren’t among the mundane things you can use your regular DSLR to observe. Given that, we can consider that whichever camera you use has some bit depth. When setting up your Schlieren apparatus, you’ll probably use half of the dynamic range of the camera for the “background” frame. This means your background intensity is about half of the dynamic range of the camera; in bit count, you’ll have half of the full-scale level count as the background, undisturbed luminance level. Now, let’s assume you need at least some minimum number of levels to distinguish the signal from the noise. Maybe you have some de-noising method, like POD, that improves your SNR and your sensitivity. How loud do the sound waves need to be in order to produce that signal, given your background luminance?
Let’s isolate the pressure amplitude:
Using our bit depth, we get the minimum pressure disturbance to be:
Converting to sound pressure level (SPL):
Evaluating the and replacing :
A capital letter is used for frequency to avoid confusion with the mirror focal length. If you know the background level, you might want to expose it in the formula instead of the bit depth of the camera pixels:
usage in design
The boxed formula is extremely useful for sound visualization design, as it allows us to define whether we can even see the sound waves we want to. First, a few remarks: I separated the bit depth because the appears in so many places. In this case, we have it with a negative sign, which is awesome, since in the end we want to minimize to maximize sensitivity. The first fraction inside the is a constant, dependent on the gas and the reference pressure. For air at 25ºC, .
Note that the minimum detectable pressure is inversely proportional to the frequency, so the minimum SPL drops by about 6dB every time the frequency doubles. This means higher frequencies are easier to see. Ultrasonic frequencies, in particular, are somewhat straightforward to see, as we will discuss. Let’s plot the minimum SPL as a function of frequency for a typical large-scale Schlieren apparatus:
The black line represents the bit depth of most contemporary scientific cameras. As you can see, the news isn’t that great: For low frequencies, we have to scream really loud to see anything – and this is after blowing up the image such that we can only see 5 levels! For example, for a frequency of 5kHz we have about 128dB of minimum sound pressure level to be perceived by our camera. It’s not impossible, and with the advent of new data processing techniques like POD it is quite feasible. But wouldn’t it be great to have 4 more bits per pixel (red dashed line)? That would bring the minimum SPL to 104dB, which would enable the visualization of a lot of phenomena (and even a rock concert!).
Well, I hope this helps you design your own Schlieren apparatus. Or maybe you lose hope altogether and quit – but at least this saved you time! If you want to design your own Schlieren setup to visualize sound waves, you can download the code that generated the chart above here. Anyhow, here’s a little video of what sound waves look like for a really loud (~140dB, ~5kHz) supersonic jet:
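The trends in this chart can be sketched in a few lines. This is not the downloadable code, only a minimal sketch under my own assumptions: contrast equal to focal length times deflection angle over knife-edge cutoff, a background at half the camera range, and made-up apparatus and gas values (all names and numbers below are hypothetical).

```python
import math

P_REF = 20e-6            # Pa, SPL reference (assumed)
GLADSTONE_DALE = 2.3e-4  # m^3/kg, air in visible light (assumed)
R_AIR = 287.05           # J/(kg K)
TEMPERATURE = 298.15     # K, 25 C
N0 = 1.0                 # ambient refractive index, approximately 1
SOUND_SPEED = math.sqrt(1.4 * R_AIR * TEMPERATURE)

# Hypothetical large-scale apparatus
OBJECT_LENGTH = 0.1  # m, extent of the sound field along the optical axis
FOCAL_LENGTH = 2.0   # m, second mirror
CUTOFF = 1e-3        # m, unobstructed source image height at the knife edge

def min_spl_schlieren(freq_hz: float, bits: int, levels: int = 5) -> float:
    """Minimum SPL (dB) that produces `levels` camera counts of signal."""
    background = 2 ** (bits - 1)  # half of the dynamic range
    k = 2.0 * math.pi * freq_hz / SOUND_SPEED
    p_min = (levels * CUTOFF * N0 * R_AIR * TEMPERATURE) / (
        background * FOCAL_LENGTH * OBJECT_LENGTH * GLADSTONE_DALE * k
    )
    return 20.0 * math.log10(p_min / P_REF)

gain_from_4_bits = min_spl_schlieren(5e3, 12) - min_spl_schlieren(5e3, 16)
gain_per_octave = min_spl_schlieren(5e3, 12) - min_spl_schlieren(10e3, 12)
print(f"4 extra bits buy {gain_from_4_bits:.1f} dB; one octave buys {gain_per_octave:.1f} dB")
```

The absolute levels depend on the assumed parameters, but the two differences do not: 4 extra bits are always worth about 24dB, which is the 128dB to 104dB step mentioned above.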
So you’ve probably already seen demos on Youtube showing this really weird “camera effect” where they stick a hose to a subwoofer and get the water to look like it’s being sucked back to the hose, seemingly against gravity.
I personally love this effect. In the case of the subwoofer, the effect is due to what is technically called “aliasing”. Aliasing is an effect important to all sorts of fields, from data analysis to telecommunications to image processing. In technical jargon, you get aliasing when you don’t satisfy the Nyquist criterion when sampling your signal. This might not be accessible to everyone, so I’ll explain it differently.
In the case of the hose stuck to the subwoofer, the speaker shakes the hose back and forth with a single frequency (a single tone) and generates a snaking/spiraling effect on the water stream. If the water stream is slow enough (that is, if its Reynolds number is low enough for the flow to be laminar), then no weird stuff (non-linear effects) occurs, and we get a simple, single-toned spatial wave in the water jet. That being the case, the cycles repeat very nicely, becoming indistinguishable from each other. If you can fulfill this criterion, then aliasing also occurs in a nice manner: if you happen to fail to satisfy the Nyquist criterion, you don’t get a jumbled mess but a nice backward or forward motion that looks like it’s in slow motion.
It is a simple thing to do, but there’s some beautiful fluid dynamics in it. Generating repeatable patterns and laminar flows is not that simple, especially when you are engineering a device. If you attempt to reproduce the video linked, you’ll find yourself suffering through a parametric search of flow rate/shake amplitude until you get the right combination that displays a nice effect.
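The slow-motion effect can be put in numbers. A minimal sketch, assuming a perfectly periodic single-tone pattern and an ideal sampler (camera frames or strobe flashes); the 24Hz/25Hz values are made up for illustration.

```python
def apparent_frequency(signal_hz: float, sample_hz: float) -> float:
    """Signed frequency seen after sampling a periodic pattern at sample_hz.

    Positive: the pattern appears to move forward, slowly.
    Negative: the pattern appears to move backward.
    Zero: the pattern appears frozen.
    """
    return signal_hz - sample_hz * round(signal_hz / sample_hz)

# Hypothetical example: speaker tone near a 25 frames-per-second camera
for tone_hz in (24.0, 25.0, 26.0):
    seen = apparent_frequency(tone_hz, 25.0)
    print(f"{tone_hz:.0f} Hz tone at 25 fps looks like {seen:+.0f} Hz")
```

A tone slightly below the frame rate gives the "water flowing back into the hose" look, while a tone slightly above gives slow forward motion.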
Here, I’ll discuss a different device, though – I’ll talk about the piddlers of Dr. Edgerton, which inspired awe in many people around the world – including myself. I have never personally seen one. But I understood what it was and that the effect was not just a camera artifact, but something that could be seen with the naked eye because they use a stroboscopic light to show the effect to the viewer. I have not – as of 2018 – found any instructions on how to make these. And since it turns out it’s quite simple, I think it should be popularized. Here’s my take:
the piddler design considerations no one talks about in practice
The water piddler is a device that generates a jet of water droplets. Water jets are naturally prone to breaking down into droplets – if you’re a man you know it! But on a more scientific tone: Water jets are subject to a fluid dynamic instability called the Rayleigh-Plateau instability. This document here is an incredible source that lets you predict the conditions for this unstable behavior without burdening you with all the complicated math behind it. The Rayleigh-Plateau instability looks like what is shown below:
It’s beautiful – but it is also not really a single frequency. There seems to be some level of repeatability to it, but not enough to make the strobe light trick work. The reason for this non-repeatability is the following curve:
This is the dispersion relationship, extracted from equation (23) of Breslouer’s work. It corresponds to an axisymmetric disturbance in the radius of the jet of the form $R = R_0 + \epsilon e^{\omega t + ikz}$, where $\epsilon$ is a small disturbance in the radius and $z$ is the streamwise coordinate. Here $k$ gives us the wavenumber of the ripple in the jet, and $\omega$ gives us the frequency of this disturbance. The dispersion relation normalizes the wavenumber by the original radius of the stream, $R_0$. When $\omega$ has a positive real part, we get an exponentially growing disturbance in the radius, which eventually makes the jet break down into droplets. So the black curve in the chart shows that any disturbance with $0 < kR_0 < 1$ will grow, but higher frequency disturbances will not – they simply oscillate. Disturbances closer to the peak at $kR_0 \approx 0.697$ will grow faster, which is an important design guideline when we want to break down the jet into a stream of droplets.
The problem, though, is that any other frequencies around the peak also grow. The peak is somewhat smooth, so there will be a lot of space for non-uniformity, especially when the disturbances themselves start at different amplitudes.
So what would be a good design procedure? Well, first, we need to make sure the jet will be laminar. One way to guarantee that is to make the Reynolds number of the nozzle that makes the stream lower than 2000. That guarantees the pipe flow is laminar, which in turn makes the stream laminar. Of course, this is a little limiting to us because we can only work with small jet diameters. You can try to push this harder, since the flow inside the stream tends to relaminarize as the stream exits the nozzle due to the removal of the no-slip condition generated by the nozzle wall.
The other constraint has to do with reasonable frequencies for strobing. You don’t want to use too low of a strobe frequency, because that is rather unbearable to watch. Strobe frequencies must be above 30Hz to be reasonably acceptable, but they only become imperceptible to the human eye at about 60Hz. We get a design chart, then:
The chart shows the growth rates (real part of $\omega$) for combinations of realistic jet diameters and velocities, which are the actual design variables. The line of constant Reynolds number looks like a 1/x curve in this space. The white line shows the upper limit for laminar pipe flow. You want to be under the white line, as well as in the growing region, which is about the region enclosed by the dashed line. For higher frequencies, the slope of the black boundary decreases, meaning you need smaller diameters to make the strobe light work. For lower frequencies, the slope increases, improving the available parameter space, but too low frequencies will be uncomfortable to watch. In case you want to develop your own piddler, a Matlab implementation to generate the colorful chart above is here.
It is actually rather remarkable that the parameter space looks like this, because feasible diameter/frequency combinations actually will break down into droplets if excited with 60Hz – the line frequency in the US. Say, for example, for 3mm jet diameter and 1m/s speed, we have a high growth rate and the piddler will produce a nice effect. At 6mm, 0.5m/s, we still have laminar flow but the instabilities won’t grow at 60Hz (lower frequency instabilities grow instead). Thus, you’ll not get a good piddler out of that combination. You might be able, for example, to push the bounds a bit (which I did) and make the jet diameter 4.75mm and the jet speed about 1.2m/s. In that case, the Reynolds number is about 3200, which still makes a reasonably repeatable piddler pattern.
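The three examples above can be checked with a minimal sketch. It assumes the disturbance is carried along at the jet velocity, so a 60Hz excitation imposes a wavenumber of 2π·60/U, and that growth requires the wavenumber times the jet radius to be below 1. It also assumes a Reynolds number based on the jet radius with a kinematic viscosity of 0.89e-6 m²/s (water near 25ºC), which is my guess at the convention since it reproduces the quoted value of about 3200; a diameter-based value would be twice as large.

```python
import math

NU_WATER = 0.89e-6  # m^2/s, water near 25 C (assumed)

def check_design(diameter_m: float, velocity_ms: float, frequency_hz: float = 60.0):
    """Return (radius-based Reynolds number, k*R0, whether the disturbance grows)."""
    radius = diameter_m / 2.0
    reynolds = velocity_ms * radius / NU_WATER
    # Disturbance convected with the jet: k = 2*pi*f / U
    kr0 = 2.0 * math.pi * frequency_hz * radius / velocity_ms
    return reynolds, kr0, kr0 < 1.0

designs = {
    "3 mm, 1 m/s": (3e-3, 1.0),
    "6 mm, 0.5 m/s": (6e-3, 0.5),
    "4.75 mm, 1.2 m/s": (4.75e-3, 1.2),
}
for name, (diameter, velocity) in designs.items():
    reynolds, kr0, grows = check_design(diameter, velocity)
    print(f"{name}: Re = {reynolds:.0f}, k*R0 = {kr0:.2f}, grows at 60 Hz: {grows}")
```

Under these assumptions the 3mm and 4.75mm designs sit inside the growing band, while the 6mm jet at 0.5m/s lands well outside it, matching the description above.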
Another thing you can attempt (I did) is to try to use a more viscous fluid. More viscous fluids will increase the viable diameter/velocity combinations where the Reynolds number is still low by pushing the white line up and to the right. For example, propylene glycol allows us to approximately double the diameter of the pipe. The problem, obviously, is that it’s incredibly messy!
now to the real world
The design map is a good guideline to start this, but there are a few tricks that no one really describes on the internet. I’ll save you weeks of suffering: The best way to generate the disturbance is to use a coffee machine pump. Yes, a coffee machine pump! It accomplishes the two tasks for this device: Recirculating the fluid and generating a strong enough, single-frequency disturbance such that you don’t really need to trust the Rayleigh-Plateau instability alone to generate the droplets.
This is the basic implementation of the piddler I built. The coffee machine pump is a ULKA-branded one (like this one). I believe it doesn’t really matter which model you use, since mine had too much flow rate and I had to reduce the flow with the flow control valve indicated in the schematic. These pumps are vibratory pumps. They function by vibrating a magnetic piston back and forth with a big electromagnet. The piston pumps a small slug of fluid, which goes through a check valve inside the pump. When the piston returns, the suction shuts off the check valve, preventing back-flow. Suction fills the piston cavity with new fluid, and the cycle repeats.
Since the piston is pumping the fluid in tiny slugs at the correct frequency (assuming you have a viable Rayleigh-Plateau instability in your design), an acoustic wave will travel through the water up to the nozzle, generating the intended velocity disturbance in the mean flow. It will not be a choppy flow, but an oscillating one, due to the really strong surface tension forces in water. The figure on the left shows that there’s a full wave of instability before the stream breaks down into droplets.
Now that we discussed the design, let’s go for a little demo. In this video, I’ll also go through the z-Transform, which is a cool mathematical tool for modeling discrete-time control systems. I used this piddler as a prop for the lecture!
So I’ve been spending quite a bit of time thinking about vortex rings. Probably more than I should! I decided I wanted something that shot vortex rings filled up with smoke, but in a way that can last for very long periods of time. I came up with this idea that if I had an ultrasonic mist maker, I would be able to generate virtually endless fog that I could use for this. As a tease, this is what I came up with:
So hopefully you also found this cool! To be honest, I don’t know how this isn’t a thing yet – this could very well be a product (i.e., it still performs the function of a room humidifier, but vortex rings are just cooler!). Ok, so how did I do it?
What’s a vortex ring?
A vortex ring is a (rather important) fluid mechanical structure. It is present in pretty much all realistic flows and plays an important role in turbulence. Generating one is simple: When fluid is squeezed through a round nozzle, it forms a temporary jet that “curls” around the edges of the nozzle. Since the nozzle is round in shape, the curling happens all around, like a ring of spinning fluid. If the “squeezing” stops, the curling continues, though, through inertia. One thing we learn in fluid mechanics is that a vortex (this curled fluid structure) induces a velocity everywhere in the flow field – i.e., it tries to spin everything around it. If the nozzle blows upward, the left-hand side vortex core induces an upward speed on the right-hand side. The same happens from the right-hand side vortex core, it also induces an upward speed on the left-hand side. It actually happens all around the circle, meaning the vortex ends up propelling itself upward.
If the flow of the vortex ring is somewhat laminar and we seed it with smoke, we can see the vortex ring propelling itself as a traveling ring (as in the video) because it persists for quite a long time. Eventually, it becomes unstable and stretches until it twists and crosses itself, rapidly breaking down to tiny vortices and spreading itself in a turbulent cloud.
How do I make one?
You need a means of generating smoke. A smoke machine of the kind used on stages and at parties is generally the easiest way to get started. You fill a bucket with smoke, have a hole about 1/4 of the bucket diameter on one end, and then tap the opposite end. This replicates the “squeezing” process described before. It is not really an ideal solution, though, because the smoke fluid has to be replenished quite often. Plus, routing the smoke from the machine to the device that produces the smoke rings is not really easy (the smoke condenses on the walls of the pipe and leaves a lot of oil in it).
So this idea struck me. If I use an ultrasonic fog generator (like this one), then I can produce ungodly amounts of smoke from a relatively small water tank. This smoke can last for hours and be stored in the water tank to increase its density. This is what I came up with:
A speaker is connected to a little water bucket (an ex-ice cream container) through this funky-looking black 3d-printed part. It’s just a duct that adapts the size of the speaker to the size of the orifice in the bucket. The bucket has about 120mm height, and the water level is about 70-100mm. The ultrasonic transducer is simply submerged in the bucket, generating tiny water droplets (a mist). The mist will mostly stay in the container, since gravity makes the droplets rain back to the water eventually. The tank lid has a nozzle, which is the only exit available for the air and the mist, once it is pushed by the speaker’s membrane. Thus, the speaker acts as a piston, an electromechanical actuator, and displaces air inside the bucket. In the forward stroke, it squeezes the air out, forming a vortex ring. In the return stroke, it draws the air back in. The waveform has to be asymmetric, such that the suction power is less than the blowing power. Otherwise, the vortex rings are sucked back into the nozzle, and though they do propel, a lot of their niceness is destroyed.
The figure above shows the best waveform shape I found for driving the speaker. It is quite important to get the waveform right! Even more importantly, it is crucial to DC-couple the driver. If you AC-couple this waveform, it will not work at the low frequencies (i.e. 1-2Hz). It’s easy to test with a function generator, since the waveform is already DC-coupled. In the end application, however, I ended up building a Class-D amplifier, without the output filter stage. The speaker itself removes the high frequency content due to its inductance.
I would share my design (mechanical drawings, etc). But this is such a custom-built device to fit a random ice cream container I found that there’s no point in doing that. I’m sure if you are determined enough to make this, you’ll find your way! A few tips:
Fog height between water level and top of the tank is somewhat important. The particular fog machine I used generates 30-50mm of fog height above the water level. If the fog is not to the top of the tank, when the speaker pumps the fluid out there will be no fog carried with it, which will result in an un-seeded vortex ring, ruining the visual effect. I found that the fog doesn’t overflow through the nozzle even when the tank lid is closed, even with a high water level.
The displaced volume is important. The larger the speaker size (I used a 4″ speaker with a 20mm nozzle), the less it has to displace to produce a nice vortex. A ratio between speaker diameter and nozzle diameter of 5 seemed to work well to me.
Remember, velocity is dx/dt. This means when you increase the frequency of the signal, the velocity increases linearly (2x frequency, 2x velocity). This means that, as you increase the frequency of the signal, you don’t need as much amplitude to generate the same exit velocity. Since exit velocity roughly determines the vortex circulation and, therefore, the vortex Reynolds number, you want to keep that number the same in your experiment. Say, if you double the frequency keeping the amplitude of the voltage signal, you’ll get twice the exit velocity, which will make the vortices shoot twice as fast (i.e., they’ll go further) and with twice the Reynolds number (i.e., they will become turbulent and break down earlier). There’s a balance to strike here.
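The frequency/amplitude trade-off in the last tip can be sketched with simple continuity. This is a rough sketch only: it assumes a sinusoidal cone motion (the real drive waveform is asymmetric), incompressible air, and a cone acting as a flat piston with the full speaker diameter. The 1mm stroke amplitude is a made-up number.

```python
import math

def peak_exit_velocity(freq_hz: float, piston_amplitude_m: float,
                       speaker_diameter_m: float, nozzle_diameter_m: float) -> float:
    """Peak nozzle velocity for a sinusoidal piston stroke x = A*sin(2*pi*f*t)."""
    piston_peak_velocity = 2.0 * math.pi * freq_hz * piston_amplitude_m  # max of dx/dt
    area_ratio = (speaker_diameter_m / nozzle_diameter_m) ** 2           # continuity
    return piston_peak_velocity * area_ratio

SPEAKER_D = 0.1016  # m, a 4 inch speaker
NOZZLE_D = 0.020    # m, the 20 mm nozzle
AMPLITUDE = 1e-3    # m, hypothetical cone stroke amplitude

u_1hz = peak_exit_velocity(1.0, AMPLITUDE, SPEAKER_D, NOZZLE_D)
u_2hz = peak_exit_velocity(2.0, AMPLITUDE, SPEAKER_D, NOZZLE_D)
u_2hz_half_stroke = peak_exit_velocity(2.0, AMPLITUDE / 2.0, SPEAKER_D, NOZZLE_D)

print(f"1 Hz: {u_1hz:.3f} m/s, 2 Hz: {u_2hz:.3f} m/s, "
      f"2 Hz at half stroke: {u_2hz_half_stroke:.3f} m/s")
```

Doubling the frequency doubles the exit velocity, and halving the stroke brings it back, which is the balance described in the tip.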
Mathematics in the complex plane are sometimes surprisingly difficult to understand! Well, the complex numbers definitely earned their name! Maybe you’re also studying complex analysis, or have studied it in the past and didn’t quite understand it. The fact is, it requires a lot of imagination to see the concepts.
It’s basically a way to visualize how a given variable z maps into a function f(z). If f(z) is multi-valued (like, for example if f(z)=sqrt(z)), then the complex map “compresses” the entire z plane in a fraction of the f(z) plane. In the example f(z)=sqrt(z), as you probably already learned, this fraction is 1/2 (since the exponent is 1/2). The function f(z)=sqrt(z) is then said to have two branches. Functions that have this behavior will have a branch point, which is this point where as you go along 360º in a small circle around it, the function f(z) does not make a 360º arc. The function f(z) then becomes discontinuous, “it branches”. The discontinuity is actually along a curve that starts at the branch point, and this curve is called “a branch cut”. The branch cut, however, can be any curve starting at any angle. It just needs to start at the branch point.
The mind-numbing part starts to happen when the argument of the multi-valued function (for example, sqrt(z) and log(z) are the most commonly seen functions in math) is another function. Then, the branch points lie at the roots of the argument. If there are multiple complex roots, each one of them will be a branch point, from which a branch cut has to be made. Determining one of the branch cut curves, however, determines all of the remainder branch cuts.
Let this settle a bit. Let’s say you place a cut in the target function f(z) around the branch point as a line at an angle ϑ. z in polar form can be written as z=R exp(iα). The reason why branches occur in a complex function is that f(R exp(iα))≠f(R exp(iα+2πi)), even though z=R exp(iα)=R exp(iα+2πi) (i.e., as you go around a full rotation, the point in the complex z plane is the same). But f(R exp(iα)) must be uniquely defined. This means that the function f(z) itself determines how much angular displacement one full rotation about a branch point in z produces in the f(z) plane. Therefore, defining the branch cut line automatically defines the region of f(z) that is available in the mapping z->f(z), which is “the branch” of f(z).
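To make this concrete, here is a minimal sketch for f(z)=sqrt(z), not taken from the app: the branch is selected by restricting the angle α to one full turn that starts at the cut angle. The function name and the sample points are mine.

```python
import cmath
import math

def sqrt_branch(z: complex, cut_angle: float) -> complex:
    """Branch of sqrt(z) whose cut is the ray leaving 0 at cut_angle (radians).

    The angle of z is taken in [cut_angle, cut_angle + 2*pi) and then halved.
    """
    radius = abs(z)
    alpha = cmath.phase(z)  # in (-pi, pi]
    alpha = cut_angle + (alpha - cut_angle) % (2.0 * math.pi)
    return math.sqrt(radius) * cmath.exp(1j * alpha / 2.0)

# The same point z = 4, seen through two different branches
w_first = sqrt_branch(4 + 0j, -math.pi)  # cut on the negative real axis
w_second = sqrt_branch(4 + 0j, math.pi)  # same cut line, next turn of the angle

# Two points hugging a cut placed on the positive real axis
above = sqrt_branch(cmath.exp(0.001j), 0.0)
below = sqrt_branch(cmath.exp(-0.001j), 0.0)

print(w_first, w_second)     # both square to 4, but they differ by a sign
print(abs(above - below))    # jump of about 2 across the cut
```

Both values square back to the same z, yet one full turn of α flips the sign of the result, and moving the cut angle just changes where that jump is placed in the z plane.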
The app I developed computes that, for simple functions like sqrt(z+5). However, it is not general. It assumes that the branch cut starts at (0,0) in the f(z) plane. If the branch cut has to start at a different coordinate (say, for f(z)=sqrt(z)+1 it would have to start at (1,0) ), then the app mapping does not work correctly. It, nevertheless, gives some insight (in my opinion). Especially for a student! If you wanna develop it further, let me know!!
Consider the function f(z)=sqrt(z^2-4). This function has two branch points, i.e., where the argument of sqrt() is zero. These points are the roots of z^2-4, which are +2 and -2. The square root function maps the full z plane to a half plane, which is why in the app it appears like this:
On the right-half, we see the f(z) plane. The color represents the angle, i.e., arg(f(z)). We see the upper half plane in vivid colors, whereas the lower half is ‘dull’. The upper half is the current branch. The branch cut is defined, interactively, by the white solid line. The other branch cut line is, as discussed, automatically defined in the dashed line by the half-plane mapping the square root function gives. This upper-half plane branch cut, shown in the z plane, looks like two lines spanning out from the branch points (+2 and -2). They appear in the left-hand figure as a discontinuity in the color (which represents the angle of f(z)).
The cool thing is that, by dragging the little black dot in the right hand figure, we can move the branch cut interactively. Here’s a video of what happens if we go around the circle:
And another one going through other random functions:
I hope this helps you as much as it helped me understand these concepts! It’s actually quite cool once you visualize it!
As I discussed in this past post about MIDIJets, I was attempting to make a platform for surveying microjet actuator location and parameters in aerodynamic flows for my PhD research. But I think this is something that can be quite useful in many other contexts. After working with this for a couple of months now and realizing how robust the driver I developed was (yes, I’m proud!), I decided to release this project as open-source hardware. Maybe someone else might find this useful?!
With that said, the project files can be found at this GitHub page: https://github.com/3dfernando/Jexel-Driver . The files should be sufficient for you to both build your own board, program it with a PICKit4 (I’m pretty sure you should be fine with older PICKit versions) and communicate with the Serial port through a USB connection.
What can I do with it?
Now, let’s talk about the device’s uses. Being able to control many solenoids with a single board can be very useful. In my case, the application is aerodynamic research: we can activate or energize the boundary layer of a flow. But maybe the applications could transcend aerodynamic research? Imagine a haptic feedback glove that makes vibrating air jets on your fingers, how cool would that be? Or maybe an object manipulator that works by controlling where air is issuing from? I think there are other possibilities to be explored. If you would like to replicate this, let me know.
Visualizing the jets
Here’s some quick flow-vis showing the pulsating jets with a small phase delay of 60°. Just as a reminder, visualizing jets of 0.4mm diameter is not easy, so I apologize if the video looks noisy! There’s a dust particle floating in the air in some frames. That’s kinda distracting but is not part of the experiment!
Well, I’m a mechanical engineer, so board design is not really something I do professionally. Therefore, expect some issues or general weirdnesses with my design. If you’d like to replicate this, I used a Matrix Pneumatix DCX321.1E3.C224 solenoid. It is not a large valve. The right connector is on the project BOM. The issue is that this valve is a high voltage, low current valve (24V, 50mA). The driver shield I designed has those specs in mind. This means a different driver circuit would probably be needed for valves with different specs. Also, for higher currents, be mindful that the motherboard carries the current through it, possibly generating some noise if the driver current is too high (yes, I was not very smart in the board design!).
Well, I hope you found this mildly interesting. If you think you could use this project and you made something cool inspired by this, I would be pleased to know!
I, along with a growing share of the world’s population, have spent countless hours deliberating about an unfortunate fact of life: as we grow older, we eventually might not be able to provide for ourselves due to the natural degradation of our bodies. The capitalist society, however, provides us with the choice of converting our human capital (i.e., ability to work) into assets (money, stock, real estate) that can be used in the future to keep us going even when we lose the ability to work.
Therefore, it is important (and heavily underappreciated) to put aside a portion of your hard-earned capital for when those hard times come. Apparently, however, human psychology does not align very well with this rational argument. We naturally find ourselves jeopardizing these long-term goals by enjoying ourselves too much when we’re young and active, to the point of entering debt to buy the latest gadget.
You see, the consumerist culture of capitalism and the necessity for saving for the future are not mutually-exclusive behaviors and a reasonably intelligent and disciplined person should be able to consume goods and promote the advancement of society through the fostering of competition and the funding of technological research that is one of the greatest achievements of consumerist capitalism. It is a matter of fact, though, that this consumerist nature tempts the less rational part of ourselves to all sorts of dubious behaviors from the financial planning perspective. Thus, many different countries institutionalized retirement savings as mandatory through social security. The optimality of this solution is questionable, but under the terms of a “greater good” goal function it definitely is a sensible decision.
The collection of such a large pool of retirement capital under the management of a single countrywide institution has benefits and caveats that are rather important to discuss. High levels of money management specialization should be expected from such an institution, given that the best in their field can be afforded to manage such a large asset. On the other hand, the data does not support the idea that highly specialized money managers working with institutional capital outperform simply buying and holding the market [as very well explained by Benjamin Felix, references in the video]. It is also reasonable to expect that many behavioral and idiosyncratic market risks can be diluted with a larger pool of capital, averaging out the effects of spurious market movements. In third-world countries like the one I come from [Brazil], however, there is less trust in the effectiveness of the management of these funds, as the transparency of social security data is low and a lot of room is available for dishonest behavior. I personally see social security in such environments as another “tax”, which does pay back in the long term but is prone to mismanagement and corruption.
Don’t retire early
With my stance in the argument set, I believe that regardless of social security one should save personal funds for discretionary retirement. You see, differently from people of the FIRE movement [Financial Independence, Retire Early], I believe careful selection of your professional career path during your 20’s should be enough to provide sufficient personal satisfaction from your job such that you wouldn’t need – or want – to retire early. If one’s job is fulfilling and provides them with a sense of contribution to society, why would they trade it off for “enjoying life” by doing absolutely nothing useful? Obviously, enjoying vacation trips now and then is important for a healthy life balance, but I’d say that would become boring rather fast if that was the only thing you did for a couple decades.
Granted you chose a fulfilling career, it is sensible to keep contributing to society for as long as we physically and mentally can. If you did not, consider changing while you can. Even if it is financially less rewarding, in the long run you’ll keep it up for much longer. And the fact that you enjoy what you do usually makes you willing to spend the time to do the “extra mile”, which is key to becoming respected in your area.
Nobel laureate Eugene Fama showed through his research that, in an efficient market, actively managing your money gives you no statistical edge against an investor that simply buys and holds the market. Fama also shows that there are some specific factors that have reasonable theoretical foundations and explain the gains of the market as a whole. The Fama-French “Three-factor model” shows that regression fitting of historical stock pricing data can explain the performance of asset portfolios by three factors: the market factor would be a “premium” for investing in the higher-risk market; the size factor would be a premium for investing in higher-risk smaller capitalization stocks; and the value factor would be a premium for investing in companies that possess a higher book-to-market ratio. I confess I don’t fully understand the theoretical justification for increased average returns for the higher risk stocks. One thing that I find reasonable, in my personal ignorance of the financial market, is that the market factor justifies itself as long as we have large positive macroeconomic movements (i.e., as long as population grows, total amount of goods produced increases through more technology, etc.). I think it is a rather important limitation of Fama’s model that we need these macroeconomic movements to occur in order to have our stocks grow long term, and that major macroeconomic downturns are not impossible in the future if catastrophic events occur. Due to the unlikelihood of these events and the hopelessness of safeguarding from them even if they do occur, I still believe investing is a reasonable strategy.
Fama’s research sparked the composition of potentially one of the greatest tools for financial investing: Index funds. Though index funds have been around for several decades now, index plus other factor-related funds have popped into existence with incredibly small fees and the liquidity of a stock through exchange-trading. This allows small, individual investors, to decide for themselves their investment strategy and their risk tolerance in a DIY approach. If you’ve seen any other post from this blog, you know I love DIY!
The largest casino in the world
When I was 18 years old I had my first experience with the stock market. After playing with it for a few months, I concluded, in my naive view of the world back then, that the stock market is just institutionalized gambling. The emotions you feel when your money fluctuates in the market are rather bewildering and I honestly experienced some real adrenaline pump while binge-watching my long positions fluctuating with the market’s tide. It all seemed random, though. I tried looking for patterns, learned technical analysis and applied it as a guide to my investments. But after getting deeply acquainted with it, I felt like I was just finding patterns in randomness, as we do when we see faces in clouds or stick figures made from stars in the sky. These patterns appeared to have the same predictive power as flipping a quarter. After that experience, I decided that I would not touch stocks ever again in my life.
Academic research really helped me to have a more sober view of the market. The outreach work by Ben Felix also helped me see through the bullshit of financial channels and blogs on the internet. I felt, after what was pretty much a decade, more prepared to give it another shot. The knowledge of statistics, scientific bias, data analysis and just plain critical thinking developed through higher education were instrumental in the establishment of my current, totally non-expert opinion of the financial market. So I decided to write this and share some of my humble data analysis results in the hopes that other people might find it “dumbed down” enough to give it a go. I confess that some of the papers by Fama and French are still over my head due to sheer academic jargon and encoding.
As I hinted at before, it is worth the exercise to ask ourselves why it makes sense to invest in the market in the first place. Why does the stock market seem to grow ever higher in value? Where is the wealth being generated? Is the market a zero-sum game? If so, who is losing money?
These questions still linger in my head, to be honest. We need to address what a zero-sum game is, I think, to get started. A zero-sum game is just a description for systems where the total amount of a token is conserved, such that only transfers of that token are possible between the players. This means no “token” is being created out of nowhere. All games in casinos are zero-sum, for example. They involve the players putting their money in a pot, and the results of the game determine how much of that pot is distributed to the winners/losers. Usually in a casino, the game is such that the “house” has a slight statistical edge and will, over thousands of rounds, accumulate wealth. Since the game is zero-sum, that wealth must come from the players of the game. We have, then, a very good distinction between “investing” and “gambling”. While both endeavors are risky and statistical in nature, “gambling” is a zero-sum game. “Investing”, on the other hand, is a positive-sum game.
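The zero-sum bookkeeping can be sketched in a few lines. This is only a toy under stated assumptions: the function name play, the number of players, the starting wallets and the stake are invented for the example, and the win probability of 18/37 is the even-money bet on a single-zero roulette wheel.

```python
import random

def play(rounds, players=3, stake=1.0, win_prob=18 / 37, seed=0):
    """Toy casino: every bet only moves money between a player and the house."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    wallets = [100.0] * players        # hypothetical starting money per player
    house = 0.0
    for _ in range(rounds):
        for i in range(players):
            if rng.random() < win_prob:
                wallets[i] += stake    # player wins, house pays
                house -= stake
            else:
                wallets[i] -= stake    # player loses, house collects
                house += stake
    return wallets, house

wallets, house = play(rounds=10_000)
print("players:", wallets)
print("house:  ", house)
print("total:  ", sum(wallets) + house)  # always the initial 300.0
```

Whatever the individual outcomes, the total never changes: the house’s gain is exactly the players’ loss. A positive-sum game would need the total itself to grow.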
But how is this even possible? How can one create money out of thin air? Well, surely the Federal Reserve in the US (and their equivalent in other countries) do, right? That makes the game positive-sum because now money has been created out of thin air. Well, not really. Though the total numerical amount of money might be larger due to “materialization” of money, no actual wealth was created by doing so.
This brings us to an important point in investing. What does money mean? What is the nature of wealth? Well, I don’t pretend to know the answer of these questions. My readings lead me to believe that money is a token that is institutionalized in our governments through thousands of years of iterations. It seems to be a natural manifestation of society. Instead of trading goods directly, we use the money token as a convenience. It by itself only stores value because everyone agrees it has value. Without diverting too much on why money has value, one can meditate that a way one can earn money, and therefore generate value, is through work. Careful application of one’s time and expertise to transform raw materials into more useful devices, goods or other consumables is a reasonable means of earning money. Let’s take the example of a material good, say, a chair. A chair stores value within itself because it is a useful device that allows humans to comfortably sit while they’re doing less involved activities or just enjoying themselves. It retains its value over time, because it keeps on accomplishing that task for a relatively long period until it finally decays to the point of becoming undesirable.
In the case of the chair, the people involved in the process of harvesting the naturally occurring materials to build it, cutting them into shapes that embody the function of the chair, and finally putting it together, need to be compensated for their time in doing so. Furthermore, the people involved in auxiliary services such as delivery, selling, handling, managing and others also will have spent a small fraction of their time on the particular chair you’re sitting on while reading this, for which they also need to be compensated. Their time, therefore, is stored in the value of the chair. And you, when making its purchase, are willing to pay your earned money to have it. Of course, your function in society also produces tangible or intangible goods in some sense, and your time is compensated such that you can afford to pay for it.
Through this reasoning I believe we can establish that goods and services store value and the production of such goods and services is how wealth (and therefore, money) is created. Some goods will last for longer, thus storing wealth for a longer time. Others, however, will last for very little time before spoiling (e.g., food) or destroying themselves, thereby retaining their value for less time. This means that wealth is also destroyed over time, and in order to have a net positive wealth generation people need to be producing more value than the value that is naturally destroyed over time. I would say that a key requirement for this to happen is that populations keep on growing, because that would increase the overall demand for goods and services.
My current understanding based on this argument is that money is just an agreed-upon representation of people’s productive time. This representation is also useful to quantify the impact of one’s relative productiveness, since some people earn more money for the same amount of time invested in contributing to society. I’m not claiming that this is a fair representation, but the dynamics of market supply and demand should, to at least some extent, dictate the relative usefulness of people’s contributions. The efficiency of the job market is a point that I myself haven’t researched too much into, however. But in a sense, this is why it is somewhat accepted that there is some positive correlation between individual wealth production and their relative contribution to society (i.e., the dichotomy between highly regarded jobs such as doctors, engineers, etc.; versus lower-waged jobs like the exploited workers of fast food restaurants and supermarkets). But I think this is too controversial a topic to be discussed here, because I don’t believe that people deliberately want to be useless in society.
So, how much is the errorbar?
Ok, this was a lot of meditation about capitalism. For personal financial decision-making, I’m sure none of that is necessary. What I really wanted to share, though, is my underwhelming observations of historical data. You see, if one believes index fund investing is a viable alternative for not only keeping their money’s value but also increasing it over time, then the evidence should point to a mean effective growth of value over time, net of inflation effects, right? Well, though that has already been shown in numerous papers, I wanted to also give it a go. So let’s take the historical S&P500 index data as a benchmark for data analysis. However, the S&P500 index does not account for inflation. So the first step is to remove inflationary effects. If we do that, we get the following chart:
Interestingly, the chart indicates about 6 times growth in the index over the course of 90 years. As of the time of this writing, the US markets are regarded to be in a “bull run”, which obviously needs to be taken into account. But I’d say that everyone agrees that, on average, there is indeed an overall trend of growth even after inflation correction. For comparison, the first data point of the series in December of 1927 shows an index value of 17.66 before correction and 262.3 after correction to 2019 money.
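The inflation correction itself is just a rescaling by a price index. A minimal sketch follows, assuming a CPI-style index is used as the deflator; the function name deflate is made up here, and no real CPI data is included. The only numbers taken from the text are the 1927 data point (17.66 nominal, 262.3 in 2019 money), from which the implied price-level ratio is backed out.

```python
def deflate(nominal, cpi, cpi_ref):
    """Express nominal values in reference-date money: real = nominal * cpi_ref / cpi."""
    return [n * cpi_ref / c for n, c in zip(nominal, cpi)]

# Price-level ratio between 2019 and December 1927 implied by the post's numbers.
implied_ratio = 262.3 / 17.66
print(f"implied CPI(2019) / CPI(1927): {implied_ratio:.2f}")

# Placeholder index levels chosen only to reproduce that ratio (not real CPI data).
real_1927 = deflate([17.66], cpi=[1.0], cpi_ref=implied_ratio)[0]
print(f"1927 index in 2019 money: {real_1927:.1f}")
```

In other words, prices rose by a factor of roughly 15 over the period, so any nominal growth below that is not real growth at all.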
So there’s a mean growth. But when we are buying into stock, we generally do not know where exactly we are sitting in this curve. Maybe now we’re at a peak? Maybe not, maybe we are still on the rise and the next crash will be way past 4000 points. The point is, we don’t know a priori and we can’t know. Especially for us peasants that are not involved in finance, it is a waste of our valuable human asset and skill in other fields of knowledge to attempt to predict that. The practical question for non-specialists is more about whether, statistically speaking, there is an expected return (which seems to be the case from the figure above) but also, what is the amplitude of the other outcomes (i.e., good and bad). This is what I mean by putting an errorbar in your money. Every time you look at the stock market, the nominal value of your holdings is volatile – there’s some fluctuation, or noise, to it. The question I want to personally answer by analyzing the data here is, how much is that noise?
It is reasonable to expect this noise to change with the time horizon. Fluctuations on a daily basis should be small, but larger excursions should be expected over time, both for bull runs and bear runs. So this question only makes sense to me under a specified time horizon. We can analyze the historical data with different time horizons, then. If we look at, say, a one-week time horizon, we can take every pair of dates 7 days apart and compute the percentage return between them. Then we average that series of returns to get a mean return over a week. We can also look at statistical properties like standard deviation and percentile values, which would give us the size of that “errorbar”. So I’ve done that. The results over overlapping periods between 7 days and 40 years look like this:
Interesting observations can be made over the long term with the chart above: A mean trend of positive returns is expected over the course of 40 years. You should expect to triple your money (200% return), inflation corrected, over 40 years. Not as much as I hoped for, to be honest, but also not that bad. It gives, to me, a very good sense of how much saving money now will be worth when I retire. This also gives grounds for decision making, which is awesome!
Furthermore, observe in the chart above that over about 30 years some positive return is not only expected: it occurred in 99% of the historical periods. It takes that long of a wait. This gives a good sense of the investment horizons we are talking about here.
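The horizon analysis described above can be sketched as follows. This is not the code behind the charts: the function names are invented, the horizon is counted in samples rather than calendar days, and the price series is a synthetic random walk, so the printed numbers say nothing about the real S&P500.

```python
import random
import statistics

def horizon_returns(prices, horizon):
    """Returns between every pair of samples `horizon` steps apart (overlapping windows)."""
    return [prices[i + horizon] / prices[i] - 1.0
            for i in range(len(prices) - horizon)]

def errorbar(prices, horizon):
    """Mean, standard deviation and 1st/99th percentiles of the horizon returns."""
    r = horizon_returns(prices, horizon)
    cuts = statistics.quantiles(r, n=100)  # 99 cut points: cuts[0] is p1, cuts[98] is p99
    return {"mean": statistics.fmean(r), "std": statistics.pstdev(r),
            "p01": cuts[0], "p99": cuts[98]}

# Synthetic stand-in for an inflation-corrected index (assumed drift and noise).
rng = random.Random(0)
prices = [100.0]
for _ in range(5000):
    prices.append(prices[-1] * (1.0 + rng.gauss(0.0002, 0.01)))

for horizon in (5, 21, 252):  # roughly a week, a month and a year of trading days
    stats = errorbar(prices, horizon)
    print(horizon, {k: round(v, 4) for k, v in stats.items()})
```

Because the windows overlap, neighboring returns are strongly correlated, so the percentiles for the longest horizons rest on far fewer independent samples than the raw count suggests.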
Unfortunately, a logarithmic scale can only be used on the time axis, as negative returns cannot be plotted on a logarithmic scale. Therefore, the returns over periods of less than a year are rather difficult to observe. So the chart below shows a zoomed version of the data, from 7 days to 1 year. We can see the growth of the “errorbar size”. Within a week, the standard deviation is 2.8% and the 1-99 percentile encompasses returns between -8.2% and +7.5%. In a month, the standard deviation grows to 5.8% and the 1-99 percentile now encompasses returns between -16% and +13%. Within a year, the 1-99 percentile grows to between -43% and +55%. Even though the mean of the returns is always positive (over a year it’s +4.54%), it is interesting to see that over a week or a month the distribution has a slightly larger negative tail. This suggests that emotional responses tend to affect negative movements of the market in the short term more strongly.
Another interesting observation I made with this data is displayed in the animated GIF below. It is interesting to see how the probability distribution is pretty much a normal distribution for periods of less than 4 months, losing its character as the periods grow longer. For a year the distribution is more triangle-shaped, and for over 3 years it starts to morph into a long positive tail. The fact that the distribution looks so much like a random normal distribution for the short period (i.e., less than a quarter) hints at the amount of time that companies need to realize gains. It also delineates the change between stock market gambling over the short term versus actual generation of wealth over the long run.
This personal analysis is, for me, very captivating evidence that the stock market is a positive-sum game. I know this is limited to the U.S. market and that the political hegemony of the U.S. is probably biasing the results to a positive conclusion, which might not hold in the long run. Nevertheless, I believe that for my short little lifespan it might still be a somewhat valid empirical guide. The statistical distribution of gains makes me more resilient to market downturns now, since I know what kinds of movements to expect in the short term, which unfortunately are rather large.
As a matter of practical mnemonic, I’d say 2 standard deviations is enough to capture the expected movement of the short-term market. This would mean weekly movements are expected to fall within about ±5%, monthly movements should be within ±10%, quarterly movements about ±20% and yearly movements about ±40%. It’s a large errorbar to put in your money, but one that has to be done if any positive expected return, inflation corrected, is desired.
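As a sanity check on this mnemonic, the round numbers are close to what a plain random walk would give. A minimal sketch, assuming returns in different weeks are independent so that the standard deviation grows with the square root of time; the only input taken from the data is the 2.8% weekly standard deviation, and the helper name two_sigma_band is made up.

```python
import math

WEEKLY_STD = 2.8  # percent, weekly standard deviation quoted in the post

def two_sigma_band(weeks, weekly_std=WEEKLY_STD):
    """Two standard deviations after `weeks` weeks, assuming sqrt-of-time scaling."""
    return 2.0 * weekly_std * math.sqrt(weeks)

for label, weeks in [("week", 1), ("month", 52 / 12), ("quarter", 13), ("year", 52)]:
    print(f"{label:8s} +/-{two_sigma_band(weeks):.1f}%")
```

This gives about ±5.6%, ±11.7%, ±20.2% and ±40.4%, close to the ±5/10/20/40% rule of thumb, and the implied monthly standard deviation (2.8% times the square root of 52/12) is about 5.8%, the same as the measured value. The independence assumption is only rough: as noted above, the measured distributions stop looking normal at longer horizons.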
I hope you also got inspired to look at the data yourself. If that’s the case, have a look at my code on GitHub. It is simple code and I’ve made some simplifications for the sake of analysis. Nevertheless, I think the conclusions are quite valid. Hope you’ve learned something!
As a disclaimer, I’m applying this technique in a scientific setting, but I’m sure the same exact problem arises when doing general macro photography. So, first, what is a Scheimpff…. plug?
Scheimpflug is actually the last name of Theodor Scheimpflug, who apparently described (not for the first time) a method for perspective correction in aerial photographs. This method is apparently called by many as “the Scheimpflug principle”, and is a fundamental tool in professional photography to adjust the tilt of the focal plane with respect to the camera sensor plane. It is especially critical in applications where the depth of field is very shallow, such as in macro photography.
As an experimental aerodynamicist, I like to think of myself as a professional photographer (and in many instances we are actually better equipped than most professional photographers in regards to technique, refinement and equipment, I reckon). One of the most obnoxious challenges that occurs time and again in wind tunnel photography is the adjustment of the Scheimpflug adapter, which is the theme of this article. God, it is a pain.
What is in focus?
First, let’s define what “being in focus” means. It is not very straightforward because it involves a “fudge factor”, called the “circle of confusion”. The gif below, generated with the online web app “Ray Optics Simulator”, shows how this concept works. Imagine that the point source in the center of the image is the sharpest thing you can see in the field of view. It could be anything: the edge of the text written on a paper, the contrast between a leaf edge and the background in a tree, the edge of a hair, or in the case of experimental fluid dynamics, the image of a fog particle in the flow field. No matter what it is, it represents a point-like source of light, and technically any object in the scene could be represented as a dense collection of point light sources.
If the lens (double arrows in the figure below) is ideal and its axis is mounted perpendicular to the camera sensor, the image of the point source will try to converge to a single point. If the point source and the lens are in the perfect distances to each other (following the lens equation), the size of the point is going to be as infinitesimal as the source, and the point image in the sensor will be mathematically sharp.
However, nothing is perfect in reality, which means we have to accept that the lens equation might not be perfectly satisfied for all the points in the subject, as that can only happen for an infinitesimally thin plane in the subject side. In the case the lens equation is not satisfied (i.e., as the dot moves in the subject side as shown in the animated gif), the image of the point source will look like a miniature image of the lens in the camera sensor plane. If the lens is a circle, then the image will look like a circle. This circle is the circle of confusion, i.e., the perfect point in the object side is “confused” by a circle in the image side.
The Aperture Effect
The presence of an aperture between the lens and the camera sensor changes things a bit. The aperture cuts the light coming from the lens, effectively reducing the size of the circle of confusion. The animation below shows the circle of confusion being reduced in size when the aperture is closed. This allows the photographer to perform a trade-off: if the circle of confusion is smaller, the image is acceptably sharp over a larger depth, increasing the depth of field. But if the light is being cut off, then light is being lost and the image becomes darker, requiring more exposure or a more sensitive sensor. The markings on the side of the lens for different aperture openings (f/3.3, f/5, etc.) indicate the equivalent, or “effective”, lens f-number after the aperture is applied. Since the lens focal length cannot be changed, the equivalent lens is always smaller in diameter and therefore gathers less light. The shape of the “circle of confusion” usually also changes when using an aperture, as most irises are n-gons instead of circles. The look of this out-of-focus blur is called “bokeh” and can be used in artistic photography.
Effect of the aperture on the circle of confusion.
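The geometry of the last two sections can be put into numbers with the thin-lens equation, 1/f = 1/s_o + 1/s_i. Below is a minimal sketch under simplifying assumptions: an ideal thin lens, the aperture sitting at the lens itself (so its diameter is D = f/N), and all lengths in millimeters. The function names and the example distances are made up; only the 6.45 μm pixel size comes from the discussion further down.

```python
def image_distance(f, s_o):
    """Thin-lens equation 1/f = 1/s_o + 1/s_i, solved for the image distance s_i."""
    return 1.0 / (1.0 / f - 1.0 / s_o)

def blur_circle(f, N, s_focus, s_o):
    """Diameter of the circle of confusion on the sensor.

    f       focal length
    N       f-number, so the aperture diameter is D = f / N
    s_focus object distance the sensor is focused on
    s_o     actual object distance of the point source
    """
    D = f / N
    s_sensor = image_distance(f, s_focus)  # where the sensor sits
    s_i = image_distance(f, s_o)           # where the point actually converges
    # Similar triangles: the light cone has diameter D at the lens and 0 at s_i.
    return D * abs(s_sensor - s_i) / s_i

PIXEL = 0.00645  # mm, pixel size used as the "ultimate" reference
for N in (2, 4, 8, 16):
    c = blur_circle(f=50.0, N=N, s_focus=500.0, s_o=520.0)
    print(f"f/{N}: blur = {c * 1000:.1f} um = {c / PIXEL:.1f} px")
```

Doubling the f-number halves the blur diameter, which is the trade-off described above: a deeper in-focus region at the cost of a quarter of the light.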
Focusing on a Plane
Hopefully all of this makes more sense now. Now let’s make our example more complex and make two point sources, representing a line (or a plane) that we want to be in focus. We’ll start with the plane in focus, which means both points are at the same distance to the lens. Tilting the plane will make the circle of confusion of the plane edges grow (in the gif below, tilting the plane is represented by moving one of the points back and forth). This will result in a sharp edge on one side of the plane and a blurry edge on the other side.
The effect you get is usually seen in practice as a gradual blurring, as for example the image below shows. It becomes blurry because the circle of confusion is growing, but how much can it grow before we notice it? It depends on how we define “noticing”. An “ultimate” reference size for the circle of confusion is the pixel size of the camera sensor. For example, the Nikon D5 (a mid-high level professional camera) has a pixel of around 6.45μm size. Cameras used in aerodynamics have pixels on that order (for example, a LaVision sCMOS camera has a 5.5μm pixel as of 2019). High speed cameras such as the Phantom v2012 will have much larger pixels (28μm) for enhanced light sensitivity. It makes sense to use the pixel size because that’s the sharpest the camera will detect. But in practice, unless you print in large format or you digitally zoom into the picture, it is very common to accept multiple pixels as the circle of confusion. With low-end commercial lenses, the effects of chromatic aberration far outweigh the focus effect at the pixel level anyways. But bear in mind that if that is the case, your 35Mpx image might really be worth only 5Mpx or so. It is also generally undesirable to have only part of the image “mathematically sharp” in a PIV experiment, since peak locking would happen only in a stripe of the image.
The Scheimpflug Principle
Well, this is the theory of sharpness, but how does the Scheimpflug principle help? Well, the next animation below attempts to show that. If you tilt the lens, the circles of confusion slowly grow to the same size, which means there would be a focal plane where they are the same exact size. I “cheated” a bit by changing the camera sensor size in the end, but in practice it is the camera that would be moving, not the object plane. This demo hopefully shows that there is a possible lens tilt angle that will bring everything in focus.
The Hinge Rule
Though I think much deeper explanations are available on the Internet (like on Wikipedia), I personally found that playing with the optical simulation makes more sense intuitively. Now we can try to understand what the Scheimpflug Hinge Rule is all about from a geometrical optics perspective.
The animation below defines two physical planes: The Lens Plane [LP], where the (thin) lens line lies; and the Sensor Plane [SP], where the camera sensor is placed. These planes, if the lens is tilted, will meet at a line (or a point, in the figure). This is the “hinge line”. The hinge line is important because it defines where the Focus Plane [FP] is guaranteed to go through. The hinge rule, however, would still be underdefined with only these planes.
The third reference line needed is defined by the Plane Parallel to Sensor at Lens Center [PSLC] and the Lens Front Focal Plane [LFFP]. The two lines are guaranteed to be parallel, and they define a plane – the Focus Plane [FP], where the point light sources are guaranteed to be in focus. A full proof of the Hinge Rule is readily available on Wikipedia and is not absolutely straightforward, so for our purposes it suffices to say that it works.
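Short of a proof, the rule can at least be checked numerically in 2D with an ideal thin lens. The sketch below uses made-up numbers (50 mm lens, sensor crossing the lens axis 60 mm behind it, 5° tilt) and invented function names. It works in lens coordinates: the lens plane is x = 0, the sensor is on the x > 0 side, and the sensor is the tilted one, which is the same geometry as tilting the lens. It images three sensor points back into object space and checks that they line up with both reference lines, which appear as points in 2D.

```python
import math

def conjugate_object_point(f, image_pt):
    """Object-side conjugate of a sensor point for a thin lens at the origin."""
    x, y = image_pt
    # The ray through the lens center is undeviated, and the thin-lens
    # equation sets the distance: object = -(f / (x - f)) * image.
    scale = -f / (x - f)
    return (scale * x, scale * y)

def hinge_geometry(f, v, tilt_deg):
    """Return the SP/LP intersection, the PSLC/LFFP intersection and 3 focus-plane points."""
    th = math.radians(tilt_deg)
    d = (math.sin(th), math.cos(th))             # direction of the sensor line
    sp_lp = (0.0, -v / math.tan(th))             # sensor plane meets lens plane (x = 0)
    pslc_lffp = (-f, -f / math.tan(th))          # line through lens center meets x = -f
    sensor_pts = [(v + t * d[0], t * d[1]) for t in (-10.0, 0.0, 10.0)]
    focus_pts = [conjugate_object_point(f, p) for p in sensor_pts]
    return sp_lp, pslc_lffp, focus_pts

def collinear(p, q, r, tol=1e-6):
    """True if the three 2D points lie on one line (cross product close to zero)."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) < tol

sp_lp, pslc_lffp, focus_pts = hinge_geometry(f=50.0, v=60.0, tilt_deg=5.0)
print("SP/LP point:    ", sp_lp)
print("PSLC/LFFP point:", pslc_lffp)
print("all in-focus points on the same line:",
      all(collinear(sp_lp, pslc_lffp, p) for p in focus_pts))
```

A side result of this geometry is that the PSLC/LFFP line sits at a distance f / sin(tilt) from the lens center, so small tilt angles push it very far away.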
Lens Hinge vs Scheimpflug Hinge
Another confusing concept when setting up a Scheimpflug system is the fact that the Scheimpflug adaptor also usually possesses a hinge that it swivels about. That hinge line (the Lens Hinge) is not to be confused with the Scheimpflug Principle Hinge, explained before. But it does interfere when setting up a camera system because the Lens Hinge is the axis the lens is actually pivoting about, so it ends up changing the focal plane angle and where the camera is looking, as well as the actual location of the focal plane. So I set up this little interactive Flash simulation here that determines the location of the plane of focus and allows you to understand the swivel movements I’m talking about. Here’s the link: http://www.fastswf.com/bHISKZA. There’s a little jitter for Scheimpflug angles close to zero due to “loss of significance” when performing the calculations, but it should be understandable.
Since most browsers aren’t very fond of letting Flash code run, you can also see a video of me focusing on an object plane (blue) below. In the animation, the camera/lens assembly swivels around the CH (Camera Hinge) axis and the lens swivels around the LH (Lens Hinge) axis. The Scheimpflug Hinge (SH) is only used when performing the focusing movement of the camera. The focus optimization algorithm, however, is somewhat straightforward for a 2D (1 degree of freedom – 1 DOF) setup:
Look at the object plane: Swivel the camera hinge CH until the camera looks at the object.
Adjust lens focus: Turn the lens focus ring (effectively moving the lens back and forth) until at least some of the object is in focus.
Change the Scheimpflug adaptor: Increase/decrease the Scheimpflug angle by some (arbitrary) value. This will make the camera look away from the object plane.
Repeat the three steps as many times as you need and you should converge to a good focus, as shown in the video. Sometimes I skip a step because it is unnecessary (e.g., the object is already partially in focus).
And here are the effects of the individual movements when using the Scheimpflug adaptor:
But Where’s the Lens Plane?
This one threw me off for a while, so I expect not everyone will be familiar with it. Let’s say you’re trying to design a Scheimpflug system and you are using regular camera lenses (i.e., a Nikon/Canon lens). These lenses contain multiple elements, so it is not obvious which “focal length” the lens is rated for and, most importantly, where this “effective lens” lies in physical space.
This reference and many others provide formulas for finding the effective focal length (EFL) of multiple lens arrangements. If the link dies, here’s the equation for a two-lens arrangement:
The effective focal length depends on the focal lengths of the two lenses (f1 and f2) as well as on the distance between them (d). But most importantly, you can swap f1 and f2 (say, if you flipped the lens arrangement) and the EFL will remain the same. This is usually the case in multiple lens arrangements, and this is why most DSLR lenses are rated for a single focal length, which is their effective focal length.
The EFL can be regarded as a means to replace the complex lens arrangement with a single thin lens. But where is that “effective lens” in physical space? Well, that is a rather difficult question because most lenses also have an adjustment ring for their focus distance. So, let’s start with a lens focusing at infinity.
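The equation image did not survive here, so as a stand-in, this is the standard thin-lens combination formula for two lenses in air separated by a distance d (assumed to be the form the linked reference gives):

$$\frac{1}{f_{\mathrm{EFL}}} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{d}{f_1 f_2}$$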
Focusing at infinity is the same as assuming the rays arriving at the lens are parallel. These parallel rays form a sharp point exactly at the lens focal point (by definition). If a compound lens is set to focus at infinity (most lenses have a setting for this), then that point must lie on the camera sensor. Therefore, the equivalent thin lens must sit exactly one focal length away from the image sensor of the camera. If we also know the camera’s Flange Focal Distance (FFD), then we know exactly where this “Effective Lens” sits with respect to the camera flange, as shown in the drawing below. For example, the FFD is 46.5 mm in a Nikon camera. A comprehensive list for many cameras is found here. Also, as a bonus, the Phantom v2012 high speed camera has FFD=45.8mm when using the factory standard Nikon F-mount adaptor flange.
Now suppose we turn the focus ring of our 50 mm lens to focus at, say, a distance of 500 mm. Then we can use the thin lens formula:
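In the usual notation (object distance o, image distance i and focal length f, all measured from the thin lens), and assuming the standard Gaussian form is the one intended, this reads:

$$\frac{1}{f} = \frac{1}{o} + \frac{1}{i}$$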
And find that for o=500 mm and f=50 mm we get i≈55.6 mm. Therefore, the thin lens moved about 5.6 mm away from the sensor to focus at 500 mm instead of infinity. If you look carefully, a lens will move farther from the sensor as we bring the focus closer:
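A quick numerical check of this trend, as a minimal sketch: it treats the lens as an ideal thin lens and measures the object distance from that thin lens. The helper names are made up for illustration; the 46.5 mm FFD is the Nikon value quoted above.

```python
def image_distance(f_mm, o_mm):
    """Thin lens formula 1/f = 1/o + 1/i, solved for the image distance i."""
    if o_mm <= f_mm:
        raise ValueError("object must be farther than one focal length")
    return 1.0 / (1.0 / f_mm - 1.0 / o_mm)

def lens_to_flange(f_mm, o_mm, ffd_mm):
    """Distance from the camera flange to the effective thin lens.

    Positive means the effective lens sits in front of the flange.
    """
    return image_distance(f_mm, o_mm) - ffd_mm

NIKON_F_FFD_MM = 46.5  # flange focal distance quoted in the text

if __name__ == "__main__":
    for o_mm in (float("inf"), 5000.0, 1000.0, 500.0):
        i_mm = image_distance(50.0, o_mm)
        offset = lens_to_flange(50.0, o_mm, NIKON_F_FFD_MM)
        print(f"o = {o_mm:7.0f} mm -> i = {i_mm:6.2f} mm, "
              f"effective lens {offset:5.2f} mm in front of the flange")
```

The closer the object, the larger the image distance, so the effective lens ends up farther from the sensor.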
Good. So this means that if we want to do some fancier photography techniques (like using the Scheimpflug principle), we can now use the EFL and its relationship to the FFD to calculate our Scheimpflug adaptor and the Scheimpflug angle needed to focus at a particular feature. Remember, in most practical setups the Scheimpflug adaptor will act as a spacer, thus preventing the lens from focusing at infinity. The more space added, the closer this “far limit” gets and the harder it becomes to work with subjects placed far from the camera.
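To put a rough number on that “far limit”, here is a sketch under the thin-lens assumption. If the lens is set to infinity and the adaptor adds a spacing s along the optical axis, the image distance becomes f + s, and the thin lens formula gives a farthest focusable distance of f(f + s)/s, measured from the effective lens. The function name and the spacer thicknesses are made up for illustration, and a real tilted adaptor will not behave exactly like a plain spacer.

```python
def far_focus_limit(f_mm, spacer_mm):
    """Farthest object distance (from the effective thin lens) still in focus.

    Assumes the lens is set to infinity, so the image distance is f + spacer.
    From 1/f = 1/o + 1/(f + s) it follows that o = f * (f + s) / s.
    """
    if spacer_mm <= 0:
        return float("inf")  # no spacer: infinity focus is still reachable
    return f_mm * (f_mm + spacer_mm) / spacer_mm

if __name__ == "__main__":
    for spacer_mm in (0.0, 2.0, 5.0, 10.0, 20.0):
        limit = far_focus_limit(50.0, spacer_mm)
        print(f"spacer = {spacer_mm:4.1f} mm -> far limit = {limit:8.1f} mm")
```

The limit shrinks quickly as the spacer grows, which matches the experience that thick adaptors only work for close subjects.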
Scheimpflug Principle in 3D [2-DOF]
So this was all under the 2D assumption, where we only need to tilt the lens in order to get the plane in focus. Easy enough for explanations, but you don’t really find that case very often in practice. If the object plane is tilted in the other direction (in 3D) we’ll need to compensate for that angle, too. That can be done by “swinging” the lens tilt axis. In a tilt-swing adaptor, there are two degrees of freedom for the lens angle. The “tilt” degree of freedom allows the lens to tilt as previously described. The “swing” degree of freedom swivels the lens around the camera, changing the orientation of the focal plane with respect to the camera. A little stop-motion animation, below, shows how these two angles change the orientation of the lens on the camera:
Or, if you’re a fan of David Guetta, you might be more inclined to like the following animation (use headphones for this one):
When doing it in practice, however, it is rather difficult to deal with the two degrees of freedom. In my experience, the causes for confusion are:
The object plane is static, and the camera is moving, but the movement is done with the lens first – this messes a little bit with the brain!
When you tilt the lens, you need to move the camera back to see the subject because now the lens is pointing away from the object plane;
It is rather hard to know whether it is the tilt angle or the swing angle that needs adjustment in a fully 3D setup;
It is hard to know whether you overshot the tilt angle when the swing angle is wrong, and just as hard to pinpoint which of the two is wrong.
This adds up to endless and painful hours (yes, hours) of adjustment in an experimental apparatus, especially if you’re not sure what exactly you’re looking for. Unlike in most professional photography, in Particle Image Velocimetry it is usual to have a rather shallow depth of field, because we want to zoom in a lot (like using a 180mm telephoto lens to look at something 500mm from the camera) and we need very small f-numbers to have enough light to see anything. Typical DoFs are less than 5mm and the camera angle is usually very large (at least 30º). But enough of the rant. Let’s get to the solution:
First we need to realize that most Scheimpflug adaptors have orthogonal tilt / swing angle adjustments. In other words, the tilt and swing angles define a spherical coordinate system uniquely. This means there is only one solution to the Scheimpflug problem that will place the plane of focus in the desired location. With that said, it would be great if the solution for one of the angles (i.e., the swing angle) could be found independently of the other angle, because that would reduce the problem to the 2D problem described before.
The good news is that, in most setups, that can be done. To find the correct swing angle:
Get the normal vector of the target in-focus plane;
Get the normal vector of the camera sensor;
These two vectors form a plane. This is the “tilt plane”.
We need the lens to tilt in this plane. To do so, the lens tilt axis needs to be normal to the “tilt plane”.
Adjust the Scheimpflug swing such that the lens swivel axis is perpendicular to the “tilt plane”. That will be a “first guess” for the Scheimpflug swing. With the swing set this way, adjusting the lens tilt alone should lead to a solution, or at least to something very close to one.
In practice there’s another complication related to the camera tripod swivel angle. If the axis the tripod swivels about is not coincident with the axis of the “tilt plane”, then the problem is not 2D. That can be solved in most cases by aligning the camera again. But if that is not possible, it will usually require a few extra iterations on the “swing angle”, too.
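The steps above can be sketched with a bit of vector algebra. This is only an illustration: it assumes you can express the two normal vectors and the sensor’s “up” direction in a common coordinate system, and that the adaptor’s zero-swing position has the tilt axis along that “up” direction (a made-up convention, so check your own adaptor). All function names are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def tilt_axis(n_object, n_sensor):
    """Unit vector normal to the "tilt plane" spanned by the two normals."""
    axis = cross(n_sensor, n_object)
    if math.sqrt(dot(axis, axis)) < 1e-12:
        raise ValueError("planes are parallel: no tilt needed, swing is arbitrary")
    return normalize(axis)

def swing_angle_deg(n_object, n_sensor, sensor_up):
    """Angle in the sensor plane from sensor_up to the required tilt axis.

    sensor_up is assumed perpendicular to n_sensor. The sign of the axis is
    arbitrary (it is a line), so +a and a-180 describe the same swing.
    """
    axis = tilt_axis(n_object, n_sensor)
    up = normalize(sensor_up)
    right = cross(up, normalize(n_sensor))
    return math.degrees(math.atan2(dot(axis, right), dot(axis, up)))

if __name__ == "__main__":
    n_sensor, up = (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)
    a = math.radians(30.0)
    # Object plane rotated 30 deg about the vertical axis of the sensor
    print(swing_angle_deg((math.sin(a), 0.0, math.cos(a)), n_sensor, up))
    # Object plane rotated 30 deg about the horizontal axis of the sensor
    print(swing_angle_deg((0.0, math.sin(a), math.cos(a)), n_sensor, up))
```

The result is only the “first guess” for the swing; the tilt itself still has to be found by the 1-DOF focusing iteration.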
Well, these definitions might be a little fuzzy in text. I prepared a little video where I go through this process in 2D [1-DOF] and 3D [2-DOF]. The video is available below.
Well, I hope these notes help you better understand the Scheimpflug adaptor and be more effective when doing adjustments in your photography endeavors. In practice it is almost an “art” to adjust these adaptors, so I think an algorithmic procedure always helps speed things up. Especially because these devices are mostly a tool for a greater purpose, so we are not really willing to spend too much time on them anyways.