Using randomized snapshot POD to overcome SVD memory issues


For those acquainted with POD (Proper Orthogonal Decomposition), it’s very easy to throw POD at everything data analysis related. It’s such a powerful technique! For those who aren’t acquainted, I really recommend. I’ll introduce with a refresher. If you’re good, follow to the next section! We can decompose a set of N dimensional data points, which we will call here X, into three matrices. Since I’m not too fond of the “fancy mathematical jargon”, let’s just start with a practical example. I have a movie of particle image velocimetry (PIV), which is a data set that contains the information of a flow in a plane where I was looking at. This movie is a vector field, meaning it is a 2D set of data (like a screen), but each “pixel” has three numbers to describe the information contained (Velocity measured, in x, y and z direction). See the movie below and think this could very well be simulation data or any other data set!

Therefore, the movie is a 4D data set: Let’s count the dimensions: (1) Pixels in the X direction, (2) pixels in the Y direction, (3) vector component (x, y or z) and (4) time. Since POD can only handle normal matrices, we have to do some remodeling of the data set to fit matrix algebra, which can only handle 2D arrays of numbers. In order to do that, we need to flatten all dimensions but one. Which one should be left unflattened is up to you, but in “movies”, it makes a lot of sense to leave the time dimension unflattened, since we know that things are changing through time. Technically, you could do the POD in the spatial dimension instead, and perhaps get some insight from it, too, but for this example let’s stick to the more intuitive case where dimensions (1, 2, 3) are flattened into a column vector and dimension (4), the time dimension, is left unflattened.

If we do that, we can construct a data matrix X, which has dimensionality p\times q. p, the number of rows of X, is the number of entries necessary to build the flattened snapshot of the movie (in our case, dimensions 1, 2, 3 are flattened in the p direction, so p is the number of elements in each of those dimensions multiplied together). q is the number of time snapshots in the movie. This matrix can be decomposed into three matrices, which – without getting too much in the detail – are related to solving the eigenvalue problem of X. Since X is p\times q, we can’t technically solve an eigenvalue problem, but instead we can find the singular values (and corresponding vectors) of this matrix – hence POD is also called Singular Value Decomposition. These are as important as eigenvectors, in the sense that they form a set of basis vectors, othonormal in nature, that optimally span the space X. In lay words: The singular vectors of X are a set of nice pictures that represent the most variance in X, and the singular values are the square root of this variance – meaning if X has most of its content changing in a certain way, POD will capture that in its first few singular vectors/values.

POD is usually represented as this decomposition:

\displaystyle X_{p\times q} = U_{p\times q} S_{q\times q} V^T_{q\times q}

Where U_{p\times q} is the “mode matrix”, and each of its columns contains the flattened version of the mode – which can be unpacked by reversing the process of flattening used to generate the columns of X. S_{q\times q} is a diagonal matrix with the singular values, and the modern SVD algorithms (like svd() in Matlab) return S_{q\times q} in descending order (i.e., the first diagonal entry is the most important singular value). What we’re actually after here – for the current discussion, at least – is V^T_{q\times q}. This is the “right singular vector matrix”, and each of its rows contains a series of coefficients that, when multiplied by the corresponding mode column, gives the “movie” of only that mode playing. It, in a sense, is a “time series matrix” when the unflattened q dimension is time, and its coefficients can be analyzed to interpret what each mode is doing to the system under analysis. For example, you can try to take the Fourier Transform of a row of V to understand if there’s any periodic behavior related to the corresponding mode in U. It’s quite useful!

Some caveats and my two cents

Performing SVD in low-rank systems (i.e., when most of the energy in the eigenvalues of S is contained within the first few entries) can be quite insightful, but sometimes we need long time series to get converged statistics (say, for example in the case of Fourier Analysis of V, we need as many entries in V as possible). This is particularly the case in experiments, where noise can contaminate the spectrum and spectral averaging can “clean up” our data by a good amount. In general, Fourier analysis needs long data sets to capture periodicity with a good amount of certainty. Therefore, we would like to perform SVD in the whole X matrix so we can get the whole V matrix, right? Well, yes, but it is not uncommon to run out of memory! For example, doing SVD on one of our data sets where X was 100,000\times 10,000 flooded the 128GB memory of our data processing computer! SVD is, unfortunately, an algorithm that needs quite a bit of memory – more specifically, we need \mathcal{O}(pq^2) memory, where q is the shorter dimension. But if we assume (really, not a bad assumption in many cases) that the system is low-rank (say, only the first ~300 modes really matter), we could potentially perform SVD in the 100,000\times 300 matrix, which is quite doable even in a laptop because the memory complexity scales with q^2. The problem, is obviously, that now our V matrix has only 300\times 300 entries, which is not enough to do good Fourier analysis on. So what can we do to solve this?

Well, let’s remember that each column of U is a basis vector to describe the data set. Let’s say we compute a cheaper mode matrix like the big matrix U – let’s call it R_{p\times r} (where r is the number of reduced dimensions used, in our example r=300, whereas q=10,000). We can project every snapshot (column) of X back into R to get its entry in V. To see the math better, let’s show how it works in matrix algebra:

\displaystyle X_{p\times q} = U_{p\times q} S_{q\times q} V^T_{q\times q}

Each column entry of X is a flattened snapshot. Let’s look at how we generate one snapshot x_{p\times 1} from the SVD:

\displaystyle x_{p\times 1} = U_{p\times q} S_{q\times q} v^T_{q\times 1}

So we multiply the full mode matrix U_{p\times q} and singular value matrix S_{q\times q} by a column vector that corresponds to the contribution of each mode, v^T_{q\times 1}. Note the q dimensions all cancel out, so we can “stick” our reduced matrix R_{p\times r} and its corresponding singular vectors \mathcal{S}_{r\times r} in the mix, right? Let’s do that:

\displaystyle x_{p\times 1} = R_{p\times r} \mathcal{S}_{r\times r} \nu^T_{r\times 1}

Where \nu^T_{r\times 1} are the time coefficients of the particular snapshot we’re looking at (and in a sense, is what we’re after). Let’s do some math now – premultiply both sides by R^T_{r\times p}:

\displaystyle R^T_{r\times p} x_{p\times 1} = \underbrace{R^T_{r\times p} R_{p\times r}}_{I_{r\times r}} \mathcal{S}_{r\times r} \nu^T_{r\times 1}

We get an identity matrix because R_{p\times r} is orthonormal. So let’s pre-multiply everything by the inverse of \mathcal{S} once more to get rid of it:

\displaystyle \mathcal{S}^{-1}_{r\times r} R^T_{r\times p} x_{p\times 1} = \nu^T_{r\times 1}

Or, more neatly:

\displaystyle \boxed{\nu^T_{r\times 1} = \underbrace{\mathcal{S}^{-1}_{r\times r} R^T_{r\times p}}_{\textrm{Cheap SVD}} x_{p\times 1}}

This is actually quite neat. A column vector of the reduced rank right singular vector matrix \nu^T_{r\times 1} is obtainable from ANY corresponding snapshot x_{p\times 1} by projecting it onto the subspace of R_{p\times r}. This means we can obtain R_{p\times r} cheaply by simply randomly choosing a fraction of our data set to compose X, and then perform this projection to the remainder of the data set in order to get all the columns of V^T_{r\times q}, for the first modes at least. This would be a snapshot-by-snapshot process, without the memory overhead of SVD. Of course, you can use some more “mature” algorithms, like randomized SVD, but for me the issue is that you need to load the data matrix twice: Once to produce the randomized range finder, and another time to compute the randomized SVD. When you can’t even fit the data set in memory and the data transfer time is really long, you don’t want to duplicate the loading process of the data matrix X. This approach, although approximate, removes that need.

Of course, you don’t get the full benefit of SVD. In the end, R_{p\times r} is an approximation of U_{p\times q}, thus the projection will leave some stuff out. R_{p\times r} also is using less snapshots to be constructed, so you’ll have an error related to the modes themselves with respect to U_{p\times q}. But when lack of memory literally prevents you from doing SVD altogether and you’re most interested in analyzing V^T_{r\times q} (i.e., only the first r modes of V^T), then you can still do the analysis to a pretty decent extent. Also, it might be a compelling way of storing time-resolved, 3D computational results without all the hard drive storage necessary (like a compression). So let’s look at some really nasty, noise contaminated, experimental data sets:


Ok, so we’re looking at the wake of a NACA0020 airfoil that I used in a study I’m working on. The airfoil is at 0º angle of attack, but it still sheds coherent structures because of the shear layer instability. We want to pick up what are the frequencies of these coherent structures with Time-Resolved PIV, but this technique – even though it’s amazing – is a little tough to get to low noise levels. The individual vectors in the snapshots have uncertainties of the order of 10% of the mean flow! But as we will see, this is cake for POD – as long as there’s low-rank behavior happening.

So we construct the X matrix by randomly sampling vector fields from the full data set. I used randperm() in Matlab for this part. Since the computer I’m using can handle up to about 3500 snapshots, we will look at 3500 snapshots (the full SVD) versus 350 snapshots (the reduced SVD), and then project the full X matrix onto the subspace of R to get the time coefficients of the first 350 modes.

Comparison of the time series of the first POD mode of the wing at Re_D=60,000 (Re_c=47000); “Full POD” corresponding to 3500 snapshots; “Random POD” corresponding to 350 snapshots and the projection method described here

As shown in the figure here, there’s not really much difference between the time series of the reduced, “random” POD, versus the full-fledged POD with 3500 snapshots. The difference in SVD computation, however, is quite a big deal:

Data Matrix X sizeMemory Consumption (Matlab SVD ‘econ’)Computational Time
Full POD190872\times35005.7 GB252 s
Reduced POD190872\times350too small to measure (<100MB)7.1 s

SVD memory complexity for the case studied

As we can see, the difference is quite large in this “toy, but practical case”. Perhaps we could use more snapshots in the reduced POD case, but I also wanted to push it a little. The mode shapes look indistinguishable from each other, but of course we could get an euclidean distance metric (I didn’t really bother). Looking at the spectrum, we can see that the dynamics are pretty much indistinguishable in the first modes:

I’m not fully into the CFD business, but I think this simple technique could be used to compress CFD time series data, since we already know most flows do display low-rank behavior (and we are already doing these decompositions anyways to find this low-rank behavior!). In other words, I think one could: (1) run a DNS/LES solution until their statistics are converged (as is usual in the business), (2) run for some period of time while saving enough random snapshots to generate the snapshot matrix – you still need to run for a long-enough time, though – and then (3) perform POD on the snapshot matrix, and finally (4) run for as long as necessary, and project every snapshot onto this random POD basis to get the V matrix (low memory consumption). This enables one to save the V matrix, along with the first few modes of R onto a file, and in the future reconstruct the entire simulation time series (to reasonable accuracy) for further analysis. Obviously the analysis will be limited to the compression accuracy and the time span of the capture process, so one has to be aware of that!

Well, perhaps I’ll try that in the future. But for now, I’ll leave the idea floating so someone can perhaps find some use to it.

Thermal Stacks, Transmission Lines, high power LED’s and Temperature Sensitive Paints (TSP)

I’m writing this because even though it’s already 2020, this kind of stuff still cannot be found anywhere in the internet!

Stacking layers of materials is something we all do in so many engineering applications! Electronic components, batteries, constructions panels, ovens, refrigerators and so many other layered materials that are inevitably under heat transfer. We want to predict the temperatures on the stack, at all layers, to ensure the materials are going to withstand the transfer of heat. In order to do that, any engineer worth their salt will already have used the Thermal/Electrical Circuit Analogy. And it works wonders!

But what if we want to do transient analyses? What if the materials are going to be stressed by some quick burst of power; or some sequence of bursts at a certain frequency? Of course, we can always just do some FEM and solve our issue – but one might find that this can become a multi-scale problem if there’s low frequencies AND high frequencies involved.

This is commonly the case in semiconductors. Applications I’m familiar with are related to LED pulsing, but I’m sure one will find many others. Another more obscure application I found in my research has to do with unsteady measurements of surface temperature when fluid is flowing on a surface. Some frequencies will be too fast and the stack of materials will not respond, so how do we proceed in that scenario? How to predict the temperature rises and – more importantly – understand what we need to change in the design in order to improve its frequency response?

Well, it turns out we can extend the thermal circuit analogy to transients by including the component’s thermal capacitance. Most layers in a stack would then be easily lumped into RC low-pass filters. This is the “basic approach” and it works – to a point.

So let’s examine it, to understand its pros and also its limitations:

The thermal capacitor

Capacitance in electrical circuits is the ability of a given electrical component to store charge. In the thermal analogy, this “charge” is heat – and the thermal capacitance is just the mass of the component times its heat capacity (C=m \times c_p). If we look at the energy conservation equation for a lumped element:

\displaystyle q = m c_p \frac{\partial T}{\partial t}

Which means that, for a given heat flow q, a small capacitance C leads to a large rate of temperature rise (large dT/dt). I will make this problem more 1-dimensional to consider the case where the stack of layers is much thinner than wide, because that’ll lead to interesting observations when we drop the “lumped element” part of this model. For now, let’s simply divide both left-hand and right-hand sides by the sheet area A, to get everything in terms of heat fluxes:

\displaystyle \frac{q}{A} = q" = \frac{m}{A} c_p \frac{\partial T}{\partial t}

\displaystyle q" = \frac{\rho A L}{A} c_p \frac{\partial T}{\partial t}

\displaystyle q" = \rho c_p L \frac{\partial T}{\partial t}

The terms \rho c_p L then are what we will call the specific component’s capacitance (C^*), in the SI units of J/m2.K. One can then represent a layer of a stack like the following RC circuit:

Obviously, this single-layer model is rather simple and does not constitute of much of an issue. The RC circuit has a time constant, \tau=R^*C^*, where R^*=L/k. You can check the units: R^* has units of m2.K.s/J, which times the units of C^* gives us a time constant in seconds. The cut-off frequency is f_{cutoff}=1/\tau, and we get a -20dB/decade low-pass filtering behavior.

It is really worth observing, though, that the capacitor is not connected to T_2, but to an elusive “ground” temperature. This is because, as long as you have a non-zero \partial T/\partial t, you will have heat transfer to the thermal capacitance. Therefore, the ground level can be considered absolute zero temperature – for didactic purposes. The temperature of this ground level is not important, though: since capacitors always block DC, its absolute value will never play a role.

Stacking them up (plus all the ways you will mess this up)

For thermal stacks, it is not straightforward how to assemble the circuits together. Do we put the RC circuits in series with each other? As it turns out – not really. To understand why, let’s think about what happens in a two-layer stack:

Looking at the picture above, we have a stack of two layers and 3 different circuits that could (potentially) represent it. The circuit (a) represents how we would construct the thermal RC network using the same strategy as for the single layer explained before. (a.1) is simply another representation of the same circuit, perhaps less “bulky”. The circuits (b) and (c) are wrong because they do not represent the thermal equations that describe the stack. Let’s see why:

The first layer of the stack has an input heat flux of q_1", which according to the energy conservation equation applied to this lumped element, can be split into the heat flux that stays in the element (q_{c1}") and the heat flux that goes through the element (q_i"). This heat flux then goes to the second layer and splits again into q_{c2}" and q_2". The energy conservation equations here are equivalent to the Kirchoff current laws applied to the nodes at T_1 and T_i in the circuit representation (a):

\displaystyle q_1"=q_i" + q_{c1}"

\displaystyle q_i"=q_2" + q_{c2}"

This sounds easy because you already have the answer spoon-fed to you. But if you were left to your devices, I would bet many of you (including past me) would come up with the circuit representations (b) and (c). (b) doesn’t make any sense because capacitors block DC, therefore there would be no DC heat flux through the stack – total non-sense!

The circuit in (c) also does not make sense, but it is not clear why at a first glance. If you follow Kirchoff’s laws, however, you will see that (c) cannot be right, because in the node at T_i the heat flux equation would have to be q_1" + q_{c1}=q_i" + q_{c2}" – which is simply not the correct energy balance.

It is not really very intuitive that the capacitor of the first layer of the stack should be connected to ground, instead of T_i. In fact, I made quite a few times this mistake. But looking at the energy balance, we can’t really deny there is a correct equivalent circuit.

We can also solve the full heat pde with circuits!

Ok, so we tackled the single wall and the stacked double-wall problem. I think having gone through how this can go wrong, you can now build this to multi-walled systems easily. As you probably already realized, each new layer will behave as a new first-order system, and some very interesting behavior can appear as extra poles are introduced by the new layers. One thing, however, that is worth noting, is that this approach completely fails at high enough frequencies.

To understand why, let’s think about it a little further. If the temperature oscillations in T_1 are high frequency (i.e., their period is below the R^* C^* time constant), the lumped element model would predict a decay rate of -20dB/decade, which is what is expected from a standard first-order system.

But the problem in the (zero-dimensional) lumped element model is that the whole element’s temperature is assumed to be a constant value. At high enough frequencies, this won’t be the case anymore. We can see why the lumped element model becomes inaccurate by modeling the standard 1D heat equation over a slab:

\displaystyle -\frac{\partial T}{\partial x} = R' q"

\displaystyle -\frac{\partial q"}{\partial x} = C' \frac{\partial T}{\partial t}

The negative sign in the left-hand side is related to the direction of the coordinate system. One can recover the same equations by distributing the lumped element model related to each layer into the following circuit:

Where R'=1/k (thermal resistance per unit length) and C'=\rho c_p (thermal capacitance per unit length). This circuit immediately follows from the circuit presented in (a.1) in the previous figure; since one can consider a single layer can be decomposed into an infinite stack of infinitely thin layers. Not all mechanical engineers will recognize, however, that this corresponds to a (simple) version of the 1-dimensional transmission line model. Robertson and Gross (1958, “An Electrical-Analog Method for Transient Heat-Flow Analysis”, Journal of Research of the National Bureau of Standards) did recognize this in their paper, for example. Funnily enough, their figure shows they also got the RC network wrong, since the first resistor is shown between the first capacitor and T1 – an indicator of how tricky this business can be!

A thermal layer is actually a transmission line

I invite you to read the article on Wikipedia about transmission lines if you’re not familiar with this concept, or read the book “Time-Harmonic Electromagnetic Fields”, by Harrington, 1961. Transmission line is an electrical model for pairs of conductors at high frequencies. At high enough frequencies, the pair of conductor’s distributed capacitance and inductance creates waves inside the conductor, which can be modeled by the wave equation. The element of the electrical transmission line is displayed below (figure from the Wikipedia article):

The electrical transmission line follows the 1-dimensional damped wave equation, which is derived from the Telegrapher’s equations:

\displaystyle -\frac{\partial V}{\partial x} = L' \frac{\partial I}{\partial t} + R' I

\displaystyle -\frac{\partial I}{\partial x} = C' \frac{\partial V}{\partial t} + G' V

We can easily see this is the wave equation by taking the Fourier Transform of the two equations:

\displaystyle -\frac{\partial V}{\partial x} = (j\omega L'+ R') I

\displaystyle -\frac{\partial I}{\partial x} = (j\omega C'+ G') V

And then replacing, say, the x derivative of the first equation in the second:

\displaystyle -\frac{\partial^2 V}{\partial x^2} = (j\omega L'+ R') \frac{\partial I}{\partial x}

\displaystyle \frac{\partial I}{\partial x} = -\frac{1}{j\omega L'+ R'} \frac{\partial^2 V}{\partial x^2} = (j\omega C'+ G') V

\displaystyle \frac{\partial^2 V}{\partial x^2} - (j\omega C'+ G')(j\omega L'+ R') V = 0

\displaystyle \frac{\partial^2 V}{\partial x^2} + \omega^2 C' L' V - j\omega (L'G'+ C'R') V - R' G' V = 0

Taking the IFT:

\displaystyle \frac{\partial^2 V}{\partial x^2} - C' L' \frac{\partial^2 V}{\partial t^2} - (L'G'+ C'R') \frac{\partial V}{\partial t} - R' G' V = 0

Which, obviously, is the wave equation. It’s damped by the distributed resistance R' and conductance G'. If we come back to the thermal stack and consider there’s no conductance (G'=0) and no inductance (L'=0), we recover the heat equation:

\displaystyle \frac{\partial^2 V}{\partial x^2} - C'R' \frac{\partial V}{\partial t} = 0

Where the thermal diffusivity is \alpha=1/C'R' = k/\rho c_p ; as expected.

how this interpretation can be useful

Ok, by now you might be finding this all quite trivial and, perhaps, an over-complication of the heat transfer problem. Why would one need to invoke the far more complex “transmission line” to describe the behavior of such a simple system? Well, it basically allows us to easily determine the temperature transfer functions at different points in the stack (perhaps, easily is not the best word, but with a couple easily-programmable equations), without having to solve the differential equations numerically. Plus, it allows us a very cool insight about the behavior of the themal stack at higher frequencies. As it turns out, at high enough frequencies the first layer’s characteristic impedance – a material property – dominates the transfer function.

Thermal transmission lines in practice: the pulsed led

Let’s first analyze the following case that I find is the most practical for high frequency thermal stacks: An electronic power dissipation device that sources power in short bursts. I mentioned in a previous post that you can massively overdrive LEDs to produce a lot more light than “designed” for short bursts, which enables one to expose high speed cameras – like a camera flash. We’ll look at the Luminus Devices Phlatlight CBT-140 for this analysis. The reason why AC thermal analysis is important in pulsed LEDs is because, even though the mean temperatures of each component might be within spec due to the high thermal capacitance of the heat sink, the power burst will temporarily increase the temperature of the LED die, and we obviously also don’t want that to exceed the specs of the LED. Concerns about thermal expansion will not be addressed in this discussion – but you should also do your research on that!

The CBT-140 die looks like shown below. The named squares in the electrical analogy (bottom right) represent the “transmission lines” for each component.

We will consider the die is made of sapphire (many high power LED dies are made of it). This might be incorrect for this model – but I could not find any specs on that. The die has 0.31mm thickness; whereas the copper has 3.44mm thickness. The aluminium heat sink is pretty large. For the sake of this analysis we’ll consider 10mm. As we’ll detail later – it won’t matter much for the AC analysis.

So what we want to find is the “input impedance” as seen by the current source just above the die in the figure. The current source is a heat flux source, which in this case would be a square wave with some (low) duty cycle, since power is generated by the (micrometer thin) LED just above the sapphire die. Because there’s a cascade of 3 “coupled transmission lines”, we have to use the transmission line input impedance formula three times. From Wikipedia:

\displaystyle Z_a = Z_{0,a}\frac{1+\Gamma_L e^{-2\gamma l_a}}{1-\Gamma_L e^{-2\gamma l_a}}

Where Z_a is the input impedance as seen by the node at T_a and \Gamma_L=(Z_L-Z_0)/(Z_L+Z_0) is the reflection coefficient due to the impedance mismatch of the aluminium slab and the convective boundary (also the load impedance) R_{conv}=Z_L. Z_0 in this case is the aluminium’s characteristic impedance, which in general has the form:

\displaystyle Z_0 = \sqrt{\frac{R'}{j\omega C'}} = \sqrt{\frac{1}{j\omega (k \rho c_p)}}

Note the characteristic impedance is inversely proportional to the square root of the frequency. This will explain the “half-order” decay (-10dB/decade) behavior in the frequency domain we’ll see later (in contrast with a first-order expected from these RC networks). The square root of 1/j predicts a phase lag of 45º at high frequencies (not 90º!). Also referring to the Wikipedia article, the propagation constant \gamma is a material property:

\displaystyle \gamma = \sqrt{j\omega R' C'} = \sqrt{j\omega \alpha^{-1}}

As we discussed, the math is not the “prettiest”, but this can all easily be hidden behind an Excel or Matlab formula. In this case, using the properties of the three materials shown in the table below and a convection coefficient of h=100 W/m^2 K \rightarrow R_{conv}=0.01 m^2 K/W, we get:

Aluminium 6061CopperSapphire
k180 W/m K287 W/m K34.6 W/m K
c_p1256 J/Kg K376 J/Kg K648 J/Kg K
\rho2710 Kg/m38800 Kg/m33930 Kg/m3
Z_04.04\times 10^{-5} \sqrt{-j\omega^{-1}}3.24\times 10^{-5} \sqrt{-j\omega^{-1}}1.06\times 10^{-4} \sqrt{-j\omega^{-1}}
\gamma137.5 \sqrt{j\omega}107.4 \sqrt{j\omega}271.3 \sqrt{j\omega}

In order to find the “input impedance” at the node T_c, we use the Z_a as the input load (Z_L), and then iterate. Since we’re looking for the maximum temperature in the stack, T_d, let’s calculate the overall input impedance of the stack Z_d:

In this case we are driving the LED with a 500 nanosecond square pulse that repeats at 100 kHz. The drive current of the LED is 200A, and given the die area of 14 mm2 and a voltage drop of 20V in the LED, we have an instantaneous power of 285 MW/m2. That’s a lot of power!! Well, as it turns out, the mean flux of power (in this case, at a duty cycle of 5%, 14.2 MW/m2) appears in the spectrum at the DC component, but the sharp bursts of power also generate incredibly high amounts of high frequency power flux that really need to be considered.

Looking at the figure above, the solid, thick black line is the actual frequency transfer function of the whole thermal stack. The dashed blue line represents the lumped element assumption and it is a good model until about a frequency of 0.05 Hz. It then starts to fail when the aluminium starts to kill the waves very close to its surface due to its “large characteristc impedance”, (Z_0). You can think about it like; its large capacitance combined with its (relatively) low conductivity prevents the thermal oscillations on one side of the material to immediately propagate to the opposite side – invalidating the lumped element assumption.

This issue happens once again for the copper (not clear exactly where) and then finally for the die itself (the sapphire) above the frequency of 40 Hz. Then the characteristic impedance of the sapphire dominates the transfer function – i.e., the sapphire is thick enough to be effectively “infinite”, as far as high frequency heat transfer is concerned (the sapphire is 0.31mm thick!!). The temperature waves then decay inside the sapphire, and very little oscillations can be detected at its opposite side. At those frequencies, it doesn’t matter anymore which material you put at the interface. It will not see any temperature oscillations.

All of this insight we get without needing to numerically solve the heat equation is quite fascinating. We now know that at high frequencies, the dominant part of the frequency response is a material property – meaning that a material choice here is crucial for high frequency heat transfer!

How important? Well, let’s run the numbers and see what the wave looks like. I removed the DC part of the wave, because this LED really needs some thermal mass attached to it (the aluminium block) as no amount of convection or heat-sinking will get rid of this insane amount of power. Therefore, let’s only focus on the oscillatory part of the wave:

We can see that we predict an overtemperature of almost 25 K for the die surface for this case. This is on top of the die heating up from the constant heat flux at DC. It is quite a lot and could (in fact, DOES) easily lead to die failure if one only reads the copper temperature. Therefore, one needs to account for this overtemperature when designing an emergency shutoff system – otherwise, it’ll only think about shutting off 25 degrees after the LED failed!

Temperature sensitive paint – high frequency measurements

So this one is quite an interesting application from the research standpoint! We use temperature sensitive paints to perform measurements in aerodynamic models. We paint an aerodynamic surface with this special paint, and it fluoresces under UV illumination, emitting a strong (usually red) light signal at colder temperatures and a weaker signal as the paint surface becomes hotter. An example of supplier of these paints is given here. Though this is quite a specialized application used mostly in high end aerodynamics research, there’s still a lot of interest into understanding its behavior.

So let’s say we want to perform unsteady measurements of heat transfer coefficient at the surface. We put the surface under a flow of nominally known temperature (for this example, a jet impinging on a surface) and then we use the TSP technique to image the temperature map over the impinging surface; getting a full temperature map. As the flow oscillates, we have fluctuations of this convection coefficient – which translate to measured fluctuations in temperature. Obviously, we might not be able to see them if the temperature does not fluctuate much because the substrate of the surface has too much thermal capacitance (as discussed throughout this article). In fact, now we know we have waves of temperature going through the material, which will be rapidly decaying. It might perhaps be the case that the waves are so weak that we cannot detect them. Therefore, we need to assess whether we can measure these waves by comparing the expected signal generated by the oscillations with the dynamic range of our camera.

Let’s start with a rather simplistic setup – to illustrate the issue. Say we have an impinging jet as in the figure below. The impinging jet is the flowing arrows at T_{\infty}.

A “thin” layer of temperature sensitive paint is painted on top of a 10mm thick aluminium plate. The layer of paint, although thin, will have an effect on the transfer function. Here’s what happens as we change the paint thickness. The solid black line represents a reasonable paint thickness (5 microns).

As we can see, the transfer function is highly dependent on the thickness of the paint: If the paint is too thin, we get dominance of the aluminium block on the transfer function, which reduces the signal gain very significantly at higher frequencies. As the paint layer thickens, it starts dominating the transfer function – but knowing the paint thickness might be a challenge.

A solution to this issue is to use an insulating film of similar properties to the paint binder (in this case, the binder used for the calculations was a Polyacrylic Acid polymer). Most importantly, if the two materials have similar characteristic impedances (Z_0), then the transfer function obtained will not be as dependent on the paint thickness.

Polyacrylic acid has the following thermal properties (sources here and here). The film considered here is a PET film (this one).

Polyacrylic acidPET Film
k0.543 W/m K0.21 W/m K
c_p1248 J/Kg K1465 J/Kg K
\rho1440 Kg/m31170 Kg/m3
Z_01.01\times 10^{-3} \sqrt{-j\omega^{-1}}1.67\times 10^{-3} \sqrt{-j\omega^{-1}}
\gamma1819 \sqrt{j\omega}2856 \sqrt{j\omega}

As we can see, the thermal impedance of the two materials is somewhat similar. This means less reflection will occur at the interface; which will help with a reduced dependence of the transfer function on the paint coat thickness. The circuit analyzed becomes the following:

This graph shows a great improvement; because the transfer function at higher frequencies is less dependent in the paint thickness. Not perfect – this would only be possible with a perfect match of thermal impedances. Nevertheless, this can definitely be used to make some measurements.

As a remark – the frequencies of interest in this work are around 20 kHz. The gain is then about -38dB (5 microns paint thickness); which is defined as 20 \log_{10}[(T_p-T_h)/(T_\infty -T_h)]. This corresponds to a signal gain of 0.011 – or; for every 1 degree of temperature fluctuation in T_\infty, we get one hundreth of a degree of fluctuation in the paint surface temperature; which is quite a small signal. Since the TSP gives a signal of about 1.4%/ºC, we are talking about an intensity signal of I/I_0 = 0.014 \times 0.011 = 1.54 \times 10^ {-4}, or 0.6 counts/ºC of a 12-bit camera where I_0 exactly saturates the camera (most likely not). I expect this paint measurement system to (hardly) detect temperature fluctuations of 10ºC pk-pk; considering good post-processing and noise reduction to reject camera noise and a very tame signal (like a pure frequency).

I’ll post updates as we perform experiments!

Tying it all back to the fourier number

So we see this interesting transfer function that switches between the lumped RC circuit assumption (with a -20dB/decade decay) to the continuous transmission line (with a -10dB/decade decay). Why does that happen and where is the “kink” in the graph?

Well, we’ll see this ties back to the thermal theory; once we find the kink frequency: The kink is where the -20dB/decade and the -10dB/decade cross. At the -20dB/decade regime, a slab of material has behaves as a capacitor and its thermal impedance is:

\displaystyle Z_C=\frac{1}{j \omega C} = \frac{1}{j \omega C' L}

We have the definition of C' in the previous sections – so let’s focus on the math. The thermal impedance of the slab in the -10dB/decade regime is simply its characteristic impedance (another interpretation is that in the -10dB/decade regime the slab is effectively infinite).

\displaystyle Z_0=\sqrt{\frac{R'}{j \omega C'} }

The kink is located about where |Z_C|=|Z_0|:

\displaystyle \bigg|\sqrt{\frac{R'}{j \omega C'} }\bigg| = \bigg|\frac{1}{j \omega C' L}\bigg|

\displaystyle \frac{R'}{\omega C'} = \frac{1}{\omega^2 C'^2 L^2}

This gives us the radian frequency of the kink:

\displaystyle \omega_{kink}=\frac{1}{L^2} \frac{1}{R' C'}

Now you should already be recognizing that we have an interesting non-dimensional coming up, as \alpha = 1/R' C':

\displaystyle \omega_{kink}=\frac{\alpha}{L^2} \rightarrow \boxed{f_{kink}=\frac{\alpha}{2 \pi L^2}}

The non-dimensional is; obviously; the Fourier number:

\displaystyle Fo=\frac{\alpha}{\omega_{kink} L^2}=1

We usually see the Fourier number with a time in the numerator (Fo=\alpha t/L^2), but this is the same kind of transient phenomenon. If the Fourier number is larger than 1, the “time scales” are large (low frequencies) and the heat transfer of the slab can be modeled by a simple capacitor. If the Fourier number is smaller than 1, then the “time scales” are small (high frequencies) and the heat transfer needs the heat equation to be modeled, which we are simply doing by using the transmission line model. The analogy is quite powerful and enables simple, quick analysis of these systems.

I boxed the f_{kink} equation because it is quite useful to determine whether our AC analysis needs to consider the transmission line model; knowing the frequencies of interest. Hope this provides the insight you needed for your analyses!

Sound Visualization: The theory works! – Part IV

So in this (last) episode of our quest to visualize sound we have to do some sound-vis experiment; right? Well, I did the experiment with an ultrasonic manipulator I was working on a couple months ago. I built a Z-Type shadowgraph (turns out I didn’t have enough space on the optical table for a knife edge because the camera I used is so enormous) with the following characteristics (see part II for nomenclature):

\displaystyle f_2=2m

\displaystyle g=1m

\displaystyle L=0.13m

For this first experiment I used a phased array device with 64 ultrasonic transducers:

When I give all transducers the same phase, they are supposed to produce a plane-like wave. This is how the wave looks like in this shadowgraph setup:

Where the oscillation levels are plus and minus 50 levels from the background level. Thus, we have N_{lv}=100 and the expected pressure fluctuation at 40kHz is 257 Pa, corresponding to a SPL of 142.2dB. Quite loud!

A measurement with a very good microphone (B&K 4939) gave me a pressure fluctuation of 230 Pa – which is quite close to the expected value from the shadowgraph!

I made a little video explaining the details of the setup and there’s some really nice footage of acoustic manipulators I built on it. Perhaps you can find some inspiration on it:

So this post was rather short. Most of the content is in the video, so I hope you have a look!

Sound Visualization: Nonlinearities – Part III

As discussed in Parts I and II, we established that we can use a Schlieren or a shadowgraph apparatus to visualize sound waves. The shadowgraph is not as interesting an instrument, due to its stronger high-pass behavior. Nevertheless, both instruments are viable as long as one makes it sensitive enough to see the waves.

When reading the previous posts, the keen reader might be weary of my isothermal assumption when applying the ideal gas law. Shouldn’t temperature also be a spatial wave? The consequence, obviously, would be that the waves become sharper with increasing loudness and easier to visualize as the non-linearity kicks in. The reader would be rather right, the non-linearity effect can be significant in some cases. But for reasonable sound visualization, the pressure fluctuation amplitude has to be above about 1 order of magnitude of the ambient pressure in order for this effect to change the wave enough. Let’s look into the math:

For a compressible flow, assuming isentropic sound waves, we have:

\displaystyle \frac{\rho}{\rho_a} = \bigg(\frac{P}{P_a} \bigg)^{1/\gamma}

Where \gamma is the specific heat ratio for the gas under analysis. A pressure wave of the form P=P_a + P_0 e^{i(\omega t-kx)} then becomes a density wave of the form:

\displaystyle \rho=\rho_a \bigg[1+\frac{P_0}{P_a} e^{i(\omega t - kx)} \bigg]^{1/\gamma}

Taking the x derivative:

\displaystyle \frac{\partial \rho}{\partial x} = -\frac{i\rho_a P_0/P_a e^{i\omega t} (k/\gamma) \big[1+(P_0/P_a) e^{i(\omega t-kx)}\big]}{P_o/P_a e^{i\omega t} + e^{i kx}}

Staring at this expression for long enough reveals we can separate the non-linearity term from the remaining terms:

\displaystyle \frac{\partial \rho}{\partial x} \frac{\gamma}{\rho_a k} = -\frac{i P_0/P_a e^{i\omega t} \big[1+(P_0/P_a) e^{i(\omega t-kx)}\big]}{P_o/P_a e^{i\omega t} + e^{i kx}}=N(P_0)

The non-linear function N(P_0) represents the wave steepening as the amplitude becomes sufficiently high. Looking at the phasor in the denominator (P_o/P_a e^{i\omega t} + e^{i kx}) we see the derivative operator split the time fluctuations from the spatial component of the fluctuation. For our detectability purposes, what really matters is the minimum and maximum value of N(P_0) to get a sense of a peak-to-peak amplitude. In this case, the phasor minimum amplitude happens at:

\displaystyle \min |N(P_0)|=\frac{P_0/P_a \big[1-P_0/P_a \big]^{1/\gamma}}{1+P_0/P_a}

Similarly, the maximum happens when the phasor at the denominator minimum point:

\displaystyle \max |N(P_0)|=\frac{P_0/P_a \big[1+P_0/P_a \big]^{1/\gamma}}{1-P_0/P_a}

Note this value can even be zero, at about 194dB SPL, in which case these equations blow up to infinity. Nevertheless, the difference gives us the peak-to-peak amplitude of the non-linear derivative of density:

\displaystyle N_{pk-pk}=\frac{P_0}{P_a} \frac{\big[1+P_0/P_a \big]^{1/\gamma} - \big[1-P_0/P_a \big]^{1/\gamma}}{1-(P_0/P_a)^2}

Finding the minimum SPL as a function of wavelength is not as simple, since the non-linear function makes it difficult to isolate P_0. However, we can do the opposite – isolate the frequency and plot the points with the axes flipped. To do so, let’s isolate the wavenumber:

\displaystyle k=\frac{2}{\rho_a} \frac{\partial \rho}{\partial x}\bigg|_{\min} \frac{\gamma}{N_{pk-pk}}

The factor of 2 comes from the fact that the calculations performed in the previous section considered the amplitude of the sine wave, not its peak-to-peak value. Now we can replace the equations for the Schlieren sensitivity to obtain:

\displaystyle F=\frac{c}{\pi \rho_a } \frac{\gamma}{N_{pk-pk}} \bigg[\frac{1}{G(\lambda)} \frac{n_0}{L} \frac{a}{f_2} \frac{N_{lv}}{2^{n_{bits}-1}} \bigg] (Schlieren)

For the shadowgraph, a second derivative needs to be taken. But as shown in the chart below, the non-linearity only becomes relevant above about 180dB. In the case you get to these SPL values, you probably have a lot more to worry on the compressibility side of things. But it comes as good news that we can use a simple relation (like the one derived in Part I) to draw insight about sound vis.

Comparison of minimum sensitivity for the linear (black) relationship derived in Part I and the compressible, non-linear relation derived here. Parameters are the same: f_2=2 m, a=1 mm, L=0.1 m and N_{lv}=5

Schlieren vs Shadowgraph for Sound Visualization – Part II

This continues our saga we started at Part I (spoiler alert: you’re probably better off with a Schlieren). Thanks to a good discussion with my graduate school friend Serdar Seckin, I got curious about applying the same sensitivity criterion to a shadowgraph system. Turns out, Settles’ book also has the equations for contrast in the case of a shadowgraph (equation 6.10). It is even simpler, since the contrast is a purely geometrical relation:

\displaystyle C_x=\frac{\Delta E_x}{E}=\frac{\partial \epsilon}{\partial x} g

Where g is the distance between the “screen” (or camera focal plane, in case of a focused shadowgraph) and the Schlieren object. Since the ray deflection occurs in the two directions perpedicular to the optical axis (x and y), we get a contrast in the x direction (C_x) and another in the y direction (C_y). The actual contrast seen in the image ends up being the sum of the two:

\displaystyle C = C_x + C_y=g \bigg(\frac{\partial \epsilon}{\partial x} + \frac{\partial \epsilon}{\partial y}\bigg)

Since the ray deflection \epsilon is dependent on the Schlieren object (as discussed in Part I), we can plug its definition to get:

\displaystyle C = \frac{L g}{n_0} \nabla_{xy}^2 n

Which is the “Laplacian” form of the Shadowgraph sensitivity. Laplacian between quotes, since it does not depend on the z derivative. We can plug in the Gladstone-Dale relation and the ideal gas law to isolate the pressure Laplacian:

\displaystyle \nabla_{xy}^2 P = \frac{C R T}{G(\lambda)} \frac{n_o}{L g}

Replacing the traveling wave form P=P_0 \exp(i[\omega t - \kappa_x x - \kappa_y y]), now obviously considering the two-dimensional wavenumber in the directions perpendicular to the optical axis:

\displaystyle P_0 = -\frac{\big(\nabla_{xy}^2 P \big)\exp(-i[\omega t - \kappa_x x - \kappa_y y])}{\kappa_x^2 + \kappa_y^2} = -\frac{\big(\nabla_{xy}^2 P \big) P \exp(-i[\omega t - \kappa_x x - \kappa_y y])}{\kappa_{xy}^2}

Where \kappa_{xy}^2 is the squared wavenumber vector magnitude, projected in the x,y plane. Applying the same criterion of N_{lv} levels for minimum contrast, we get the following relation for the minimum detectable pressure wave:

\displaystyle P_{min} = \frac{RT}{G(\lambda)} \frac{n_0}{Lg} \frac{N_{lv}}{2^{n_{bits}-1}} \frac{1}{\kappa_{xy}^2}

Which we can easily convert to minimum SPL based on the wave frequency F:

\displaystyle \boxed{SPL_{min} = 20\log_{10} \bigg(\frac{RTc^2}{4\pi^2 P_{ref}G(\lambda)} \frac{n_0}{Lg} \frac{1}{F^2} N_{lv}\bigg)- 6.02(n_{bits}-1)}

design consequences

The boxed formula for shadowgraph is very similar to the one for Schlieren found in Part 1. By far, the most important point is the dependence on 1/F^2 [Shadowgraph] instead of 1/F [Schlieren]. This is obvious, since Schlieren detects first derivatives of pressure, whereas Shadowgraph detects second derivatives, popping out an extra F factor.

From a detector apparatus perspective, however, this is not of the most desirable features: The shadowgraph acts as a second-order high-pass spatial filter, highlighting high frequencies and killing low frequencies. The Schlieren has a weaker, first-order high pass filtering behavior. But this means they have a crossing point, a frequency F where the sensitivity is the same for both. The apparatus parameters are different, however we can do the exercise for the rather realistic apparatus given in Part 1. If we make g=0.5 m (i.e., the focal plane is 0.5 meters from the Schlieren object), we get the following curve for a 12-bit camera:

Minimum detectable sound levels – Schlieren versus Shadowgraph. Schlieren parameters: f_2=2 m, a=1 mm. Shadowgraph has g=0.5 m. Both consider N_{lv}=5 and L=0.1 m.

If the news weren’t all that great for the Schlieren apparatus, the Shadowgraph is actually worse: The crossing point for this system occurs at F=200 kHz. There might be applications where these ultrasound frequencies are actually of interest, but I believe in most fluid dynamics problems the Schlieren apparatus is the to-go tool. Remembering that, even though I show SPL values above 170dB in this chart, the calculations considered a constant-temperature ideal gas, which is not the case anymore at those non-linear sound pressure levels.

With this, I’m declaring Schlieren a winner for sound-vis.

Seeing Sound Waves with your Schlieren apparatus – Part I

Tell me: Would you be excited to see sound? Well, I sure damn was when I first attempted to see the sound produced by an ultrasonic transducer! And still am, to be honest! So let’s learn how to do it with a Schlieren apparatus and the sensitivity considerations necessary in the design of a Schlieren optical system. For this, I’ll use my handy Schlieren and Shadowgraph techniques by no one else than the great Prof. Gary Settles. My PhD advisor, Prof. Alvi, was his PhD student – so I should get this right!

I’m a practical engineer, but I’ll derive the formulas in this section. If you want, you can skip ahead to the next section where I apply them and produce nice design charts.

the math

If you would like more details on the physics, please refer to Settles. This is more of a derivation section. We’ll start with the Gladstone-Dale law, which relates the index of refraction of a gas to its density:

\displaystyle n-1 = G(\lambda) \rho

Where n is the index of refraction, G(\lambda)=2.2244\times10^{-4}\big[1+(6.37132\times 10^{-8}/\lambda)^2\big] m^3/Kg (for air) is the wavelength-dependent Gladstone-Dale constant and \rho is the density of the gas.

Assuming a parallel light beam, the beam deflection \epsilon of a uniform Schlieren object of length L is given by:

\displaystyle \epsilon = \frac{L}{n_0} \frac{\partial n}{\partial x}

Where x is the direction of deflection. This deflection is magnified in the focal point of the optical apparatus mirror/lens close to the camera/screen; where the knife-edge is placed, according to:

\displaystyle \Delta a=f_2 \epsilon

Where \Delta a is the change in the focal point location in the direction perpendicular to the knife edge and f_2 is the focal length of the second optical element (lens/mirror). Then, the contrast change C observed in the camera/screen is given by Equation 3.6. in Settles’ book:

\displaystyle C=\frac{\Delta E}{E} = \frac{f_2 \epsilon}{a}

We can use the ideal gas law to combine the expressions as a function of \partial P/\partial x:

\displaystyle \frac{\partial P}{\partial x} = \frac{RT}{G(\lambda)} \frac{a}{f_2} \frac{n_0}{L} C

We can then assume we are looking for sinusoidal sound waves of the form P(x,t)=P_0 \exp(i[\omega t - \kappa x]):

\displaystyle \frac{\partial P}{\partial x} = i\kappa P_0 \exp(i[\omega t - \kappa x])

OK. Now let’s stop here for a while. I think it goes without saying you’ll probably need a high speed camera to see the sound waves, as they aren’t among the mundane things you can use your regular DSLR to observe. Given that, we can consider whichever camera you’ll use has some bit depth n_{bits}. When setting up your Schlieren apparatus, you’ll probably use half of the dynamic range of the camera for the “background” frame. This means, your intensity E is about half of the dynamic range of the camera. In bit count, you’ll have \displaystyle E=N_{BG}=2^{n_{bits}-1} as the background, undisturbed luminance level. Now, let’s assume you need at least \Delta E = N_{lv} levels to distinguish the signal from the noise. Maybe you have some de-noising method, like POD, that improves your SNR and your sensitivity. How loud do the sound waves need to be in order to produce a N_{lv} signal given you have a N_{BG} background luminance?

Let’s isolate P_0:

\displaystyle P_0 = \frac{\partial P}{\partial x} \frac{1}{\kappa} \underbrace{\big\{-i\exp(-i[\omega t - \kappa x])\big\}}_{\text{included in } N_{lv}} = RT \frac{1}{\kappa} \frac{1}{G(\lambda)} \frac{a}{f_2} \frac{n_0}{L} C

Using our bit depth, we get the minimum pressure disturbance to be:

\displaystyle P_{min} = RT \frac{1}{\kappa} \frac{1}{G(\lambda)} \frac{a}{f_2} \frac{n_0}{L} \frac{N_{lv}}{2^{n_{bits}-1}}

Converting to sound pressure level (SPL):

\displaystyle SPL_{min} = 20 \log_{10} \bigg(\frac{P_{min}}{P_{ref}} \bigg) = 20 \log_{10}\bigg({\underbrace{\frac{RT}{P_{ref}}}_{1/\rho_{ref}} \frac{1}{\kappa} \frac{1}{G(\lambda)} \frac{a}{f_2} \frac{n_0}{L} \frac{N_{lv}}{2^{n_{bits}-1}}}\bigg)

\displaystyle SPL_{min} = 20 \log_{10}\bigg(\frac{RT}{P_{ref}}\frac{1}{\kappa} \frac{1}{G(\lambda)} \frac{a}{f_2} \frac{n_0}{L} N_{lv}\bigg) - 20 (n_{bits}-1) \log_{10}(2)

Evaluating the \log_{10}(2) and replacing \kappa = 2\pi F/c:

\displaystyle \boxed{SPL_{min} = 20 \log_{10}\bigg(\frac{cRT}{2\pi P_{ref}} \frac{1}{G(\lambda)} \frac{1}{F} \frac{a}{f_2} \frac{n_0}{L} N_{lv}\bigg) - 6.02(n_{bits}-1)}

A capital F is used for frequency to eliminate confusion with f, which in this context is the mirror focal length. If you know the background level, N_{BG}, then you might want to expose it in the formula instead of the bit depth of the camera pixels:

\displaystyle SPL_{min} = 20 \log_{10}\bigg(\frac{cRT}{2\pi P_{ref}} \frac{1}{G(\lambda)} \frac{1}{F} \frac{a}{f_2} \frac{n_0}{L} \bigg) + 20 \log_{10}\bigg(\frac{N_{lv}}{N_{BG}}\bigg)

usage in design

The boxed formula is extremely useful for sound visualization design, as it allows us to define whether we can even see the sound waves we want to. First, a few remarks: I separated the bit depth because the 6 n_{bits} appears in so many places. In this case, we have it with a negative sign, which is awesome, since in the end we want to minimize SPL_{min} to maximize sensitivity. The first fraction inside the \log() is a constant, dependent on the gas and the reference pressure. For air at 25ºC, cRT/2\pi P_{ref} = 2.335\times 10^{11} m^4/Kg\cdot s.

Note that the minimum SPL is inversely proportional to the frequency F. This means, higher frequencies are easier to see. Ultrasonic frequencies, in particular, are somewhat straightforward to see, as we will discuss. Let’s plot SPL_{min} as a function of frequency for a typical large-scale Schlieren apparatus:

Sensitivity of an air-based Schlieren aparatus with f_2=2 m, a=1 mm, L=0.1 m and N_{lv}=5

The black line represents the bit depth of most contemporary scientific cameras. As you can see, the news aren’t that great: For low frequencies, we have to scream really loud to see anything – and this is after blowing up the image such that we can only see 5 levels! For example, for a frequency of 5kHz we have about 128dB of minimum sound pressure level to be perceived by our camera. It’s not impossible, and with the advent of new data processing techniques like POD it is well feasible. But wouldn’t it be great to have 4 more bits per pixel (red dashed line)? That would bring the minimum SPL to 104dB, which would enable the visualization of a lot of phenomena (and even a rock concert!).

Well, I hope this helps you design your own Schlieren apparatus. Or maybe you lose hope altogether and quit – but at least this saved you time! If you want to design your own Schlieren setup to visualize sound waves, you can download the code that generated the chart above here. Anyhow, here’s a little video of how sound waves look like for a really loud (~140dB, ~5kHz) supersonic jet:

The anti-gravity piddler: A demonstration of aliasing

So you’ve probably already seen demos on Youtube showing this really weird “camera effect” where they stick a hose to a subwoofer and get the water to look like it’s being sucked back to the hose, seemingly against gravity.

I personally love this effect. In the case of the subwoofer, the effect is due to what is technically called “aliasing”. Aliasing is an effect important to all sorts of fields, from data analysis to telecommunications to image processing. In technical jargon, you get aliasing when you don’t satisfy the Nyquist criterion when sampling your signal. This might not be accessible to everyone, so I’ll explain it differently.

In the case of the hose stuck to the subwoofer, the speaker shakes the hose back and forth with a single frequency (a single tone) and generates a snaking/spiraling effect on the water stream. If the water stream is slow enough (that is, if its Reynolds number is low enough for the flow to be laminar), then no weird stuff (non-linear effects) occur, and we get a simple, single-toned spatial wave in the water jet. That being the case, the cycles repeat very nicely, becoming indistinguishable from each other. If you can fulfill this criterion, then aliasing also occurs in a nice manner, that is, if you happen to fail to satisfy the Nyquist criterion, you don’t get a jumbled mess but a nicely backward or forward motion that looks like it’s in slow motion.

It is a simple thing to do, but there’s some beautiful fluid dynamics on it. Generating repeatable patterns and laminar flows is not that simple, especially when you are engineering a device. If you attempt to reproduce the video linked, you’ll find yourself suffering through a parametric search of flow rate/shake amplitude until you get the right combination that displays a nice effect.

Here, I’ll discuss a different device, though – I’ll talk about the piddlers of Dr. Edgerton, that inspired awe in many people around the world – including myself. I have never personally seen one. But I understood what it was and that the effect was not just a camera artifact, but something that could be seen with the naked eye because they use a stroboscopic light to show the effect to the viewer. I have not – as of 2018 – found any instructions on how to make these. And since it turns out it’s quite simple, I think it should be popularized. Here’s my take:

the piddler design considerations no one talks about in practice

The water piddler is a device that generates a jet of water droplets. Water jets are naturally prone to breaking down into droplets – if you’re a man you know it! But on a more scientific tone: Water jets are subject to a fluid dynamic instability called the Rayleigh-Plateau instability. This document here is an incredible source that enables the prediction of what are the conditions for this unstable behavior without hassling you with all the complicated math behind. The Rayleigh-Plateau instability looks like shown below:

Naturally occurring Rayleigh-Plateau instability in my kitchen

It’s beautiful – but it is also not really a single frequency. There seems to be some level of repeatability to it, but not enough to make the strobe light trick work. The reason for this non-repeatability is the following curve:

Dispersion relationship for the Rayleigh-Plateau instability

This is the dispersion relationship, extracted from equation (23) of Breslouer’s work. It corresponds to an axisymmetric disturbance in the radius of the jet – R(z,t)=R_0 + \varepsilon e^{\omega t + ikz}, where \varepsilon is a small disturbance in the radius R and z is the streamwise coordinate. k gives us a wavenumber of the ripple in the jet, and \omega gives us a frequency of this disturbance. The dispersion relation normalizes the wavenumber k by the original radius of the stream, R_0. When \omega >0, we get an exponentially growing disturbance in the radius, which eventually makes the jet break down into droplets. So the black curve in the chart shows that any disturbance between 0<k R_0<1 will grow, but higher frequency disturbances will not – they simply oscillate. Disturbances closer to the peak at kR_0 = 0.697 will grow faster, which is an important design guideline when we want to break down the jet into a stream of droplets.

The problem, though, is that any other frequencies around the peak also grow. The peak is somewhat smooth, so there will be a lot of space for non-uniformity, especially when the disturbances themselves start at different amplitudes.

So what would be a good design procedure? Well, first, we need to make sure the jet will be laminar. One way to guarantee that is to make the Reynolds number of the nozzle that makes the stream lower than 2000. That guarantees the pipe flow is laminar, which in turn makes the stream laminar. Of course, this is a little limiting to us because we can only work with small jet diameters. You can try to push this harder, since the flow inside the stream tends to relaminarize as the stream exits the nozzle due to the removal of the no-slip condition generated by the nozzle wall.

The other constraint has to do with reasonable frequencies for strobing. You don’t want to use too low of a strobe frequency, because that is rather unbearable to watch. Strobe frequencies must be above 30Hz to be reasonably acceptable, but they only become imperceptible to the human eye about 60Hz. We get a design chart, then:

Design chart for f=60Hz, water/air, 25ºC

The chart shows the growth rates (real part of \omega) for combinations of realistic jet diameters and velocities, which are the actual design variables. The line of constant Reynolds number looks like a 1/x curve in this space. The white line shows the upper limit for laminar pipe flow. You want to be under the white line, as well as in the growing region, which is about the region enclosed by the dashed line. For higher frequencies, the slope of the \approx 45^\circ black boundary decreases, meaning you need smaller diameters to make the strobe light work. For lower frequencies, the slope increases, improving the available parameter space, but too low frequencies will be uncomfortable to watch. In case you want to develop your own piddler, a Matlab implementation to generate the colorful chart above is here.

It is actually rather remarkable that the parameter space looks like this, because feasible diameter/frequency combinations actually will break down into droplets if excited with 60Hz – the line frequency in the US. Say, for example, for 3mm jet diameter and 1m/s speed, we have a high growth rate and the piddler will produce a nice effect. At 6mm, 0.5m/s, we still have laminar flow but the instabilities won’t grow at 60Hz (lower frequency instabilities grow instead). Thus, you’ll not get a good piddler out of that combination. You might be able, for example, to push the bounds a bit (which I did) and make the jet diameter 4.75mm and the jet speed about 1.2m/s. In that case, the Reynolds number is about 3200, which still makes a reasonably repeatable piddler pattern.

Another thing you can attempt (I did) is to try to use a more viscous fluid. More viscous fluids will increase the viable diameter/velocity combinations where the Reynolds number is still low by pushing the white line up and to the right. For example, propyleneglycol allows us to approximately double the diameter of the pipe. The problem, obviously, is that it’s incredibly messy!

now to the real world

The design map is a good guideline to start this, but there are a few tricks that no one really describes in the internet. I’ll save you weeks of suffering: The best way to generate the disturbance is to use a coffee machine pump. Yes, a coffee machine pump! It accomplishes the two tasks for this device: Recirculating the fluid and generating a strong enough, single-frequency disturbance such that you don’t really need to trust the Rayleigh-Plateau instability alone to generate the droplets.

This is the basic implementation of the piddler I built. The coffee machine pump is a ULKA-branded one (like this one). I believe it doesn’t really matter which model you use, since mine had too much flow rate and I had to reduce the flow with the flow control valve indicated in the schematic. These pumps are vibratory pumps. They function by vibrating a magnetic piston back and forth with a big electromagnet. The piston pumps a small slug of fluid, that goes through a check valve inside the pump. When the piston returns, the suction shuts off the check valve, preventing back-flow. Suction fills the piston cavity of new fluid, and the cycle repeats.

Since the piston is pumping the fluid in tiny slugs at the correct frequency (assuming you have a viable Rayleigh-Plateau instability on your design), an acoustic wave will go through the water until the nozzle, generating the intended velocity disturbance in the mean flow. It will not be a choppy flow, but an oscillating one, due to the really strong surface tension forces in water. The figure in the left shows that there’s a full wave of instability before the stream breaks down into droplets.

Now that we discussed the design, let’s go for a little demo. In this video, I’ll also go through the z-Transform, which is a cool mathematical tool for modeling discrete-time control systems. I used this piddler as a prop for the lecture!

Smoke rings to the tune of AC/DC

So I’ve been spending quite a bit of time thinking about vortex rings. Probably more than I should! I decided I wanted something that shot vortex rings filled up with smoke, but in a way that can last for very long periods of time. I came up with this idea that if I had an ultrasonic mist maker, I would be able to generate virtually endless fog that I could use for this. As a tease, this is what I came up with:

Video also available here. Smoke in the Water here.

So hopefully you also found this cool! To be honest, I don’t know how this isn’t a thing yet – this could very well be a product. (i.e., it still performs the function as a room humidifier, but vortex rings are just cooler!). Ok, so how did I do it?

What’s a vortex ring?

A vortex ring is a (rather important) fluid mechanical structure. It is present in pretty much all realistic flows and plays an important role in turbulence. Generating one is simple: When fluid is squeezed through a round nozzle, it forms a temporary jet that “curls” around the edges of the nozzle. Since the nozzle is round in shape, the curling happens all around, like a ring of spinning fluid. If the “squeezing” stops, the curling continues, though, through inertia. One thing we learn in fluid mechanics is that a vortex (this curled fluid structure) induces a velocity everywhere in the flow field – i.e., it tries to spin everything around it. If the nozzle blows upward, the left-hand side vortex core induces an upward speed on the right-hand side. The same happens from the right-hand side vortex core, it also induces an upward speed on the left-hand side. It actually happens all around the circle, meaning the vortex ends up propelling itself upward.

If the flow of the vortex ring is somewhat laminar and we seed it with smoke, we can see the vortex ring propelling itself as a traveling ring (as in the video) because it persists for quite a long time. Eventually, it becomes unstable and stretches until it twists and crosses itself, rapidly breaking down to tiny vortices and spreading itself in a turbulent cloud.

How do I make one?

You need a means of generating smoke. Smoke machines used in stages / parties is generally the easiest way to get started. You fill a bucket with smoke, have a hole about 1/4 of the bucket diameter on one end, and then tap the opposite end. This replicates the “squeezing” process described before. It is not really an ideal solution, though, because the smoke fluid has to be replenished quite often. Plus, routing the smoke from the machine to this device that produces the smoke rings is not really easy (the smoke condenses in the walls of a pipe and forms a lot of oil in it).

So this idea struck me. If I use an ultrasonic fog generator (like this one), then I can produce ungodly amounts of smoke from a relatively small water tank. This smoke can last for hours and be stored in the water tank to increase its density. This is what I came up with:

A speaker is connected to a little water bucket (an ex-ice cream container) through this funky-looking black 3d-printed part. It’s just a duct that adapts the size of the speaker to the size of the orifice in the bucket. The bucket has about 120mm height, and the water level is about 70-100mm. The ultrasonic transducer is simply submerged in the bucket, generating tiny water droplets (a mist). The mist will mostly stay in the container, since gravity makes the droplets rain back to the water eventually. The tank lid has a nozzle, which is the only exit available for the air and the mist, once it is pushed by the speaker’s membrane. Thus, the speaker acts as a piston, an electromechanical actuator, and displaces air inside the bucket. In the forward stroke, it squeezes the air out, forming a vortex ring. In the return stroke, it draws the air back in. The waveform has to be asymmetric, such that the suction power is less than the blowing power. Otherwise, the vortex rings are sucked back into the nozzle, and though they do propel, a lot of their niceness is destroyed.

The figure above shows the best waveform shape I found for driving the speaker. It is quite important to get the waveform right! Even more importantly, it is crucial to DC-couple the driver. If you AC couple this waveform, it will not work at the low frequencies (i.e. 1-2Hz).It’s easy to test with a function generator, since the waveform is already DC coupled. In the end application, however, I ended up building a Class-D amplifier, without the output filter stage. The speaker itself removes the high frequency content due to its inductance.

I would share my design (mechanical drawings, etc). But this is such a custom-built device to fit a random ice cream container I found that there’s no point in doing that. I’m sure if you are determined enough to make this, you’ll find your way! A few tips:

  • Fog height between water level and top of the tank is somewhat important. The particular fog machine I used generates 30-50mm of fog height above the water level. If the fog is not to the top of the tank, when the speaker pumps the fluid out there will be no fog carried with it, which will result in an un-seeded vortex ring, ruining the visual effect. I found that the fog doesn’t overflow through the nozzle even when the tank lid is closed, even with a high water level.
  • The displaced volume is important. The larger the speaker size (I used a 4″ speaker with a 20mm nozzle), the less it has to displace to produce a nice vortex. A ratio between speaker diameter and nozzle diameter of 5 seemed to work well to me.
  • Remember, velocity is dx/dt. This means when you increase the frequency of the signal, the velocity increases linearly (2x frequency, 2x velocity). This means that, as you increase the frequency of the signal, you don’t need as much amplitude to generate the same exit velocity. Since exit velocity roughly determines the vortex circulation and, therefore, the vortex Reynolds number, you want to keep that number the same in your experiment. Say, if you double the frequency keeping the amplitude of the voltage signal, you’ll get twice the exit velocity, which will make the vortices shoot twice as fast (i.e., they’ll go further) and with twice the Reynolds number (i.e., they will become turbulent and break down earlier). There’s a balance to strike here.

I hope this gives you some ideas!

Cutting mathematical sheets

Mathematics in the complex plane are sometimes surprisingly difficult to understand! Well, the complex numbers definitely earned their name! Maybe you’re also studying complex analysis, or have studied it in the past and didn’t quite understand it. The fact is, it requires a lot of imagination to see the concepts.

I sometimes like to compensate for my lack of imagination by writing an app that visualizes the things I wanna see. This one, I would like to share with others, so I wrote it in Javascript. You can have a look at it here.

It’s basically a way to visualize how a given variable z maps into a function f(z). If f(z) is multi-valued (like, for example if f(z)=sqrt(z)), then the complex map “compresses” the entire z plane in a fraction of the f(z) plane. In the example f(z)=sqrt(z), as you probably already learned, this fraction is 1/2 (since the exponent is 1/2). The function f(z)=sqrt(z) is then said to have two branches. Functions that have this behavior will have a branch point, which is this point where as you go along 360º in a small circle around it, the function f(z) does not make a 360º arc. The function f(z) then becomes discontinuous, “it branches”. The discontinuity is actually along a curve that starts at the branch point, and this curve is called “a branch cut”. The branch cut, however, can be any curve starting at any angle. It just needs to start at the branch point.

The mind-numbing part starts to happen when the argument of the multi-valued function (for example, sqrt(z) and log(z) are the most commonly seen functions in math) is another function. Then, the branch points lie at the roots of the argument. If there are multiple complex roots, each one of them will be a branch point, from which a branch cut has to be made. Determining one of the branch cut curves, however, determines all of the remainder branch cuts.

Let this settle a bit. Let’s say you place a cut in the target function f(z) around the branch point as a line at an angle ϑ. z in polar form can be written as z=R exp(iα). The reason why branches occur in a complex function is because f(R exp(iα))≠f(R exp(iα+2πi)) , even though z=R exp(iα)=R exp(iα+2πi) (i.e., as you go around a full rotation, the point in the complex z plane is the same. But f(R exp(iα)) is uniquely defined. This means that the function f(z) itself determines how much angular displacement one full rotation about a branch point in z incurs in the f(z) plane. Therefore, defining the branch cut line automatically defines the region of f(z) that is available in the mapping z->f(z), which is “the branch” of f(z).

The app I developed computes that, for simple functions like sqrt(z+5). However, it is not general. It assumes that the branch cut starts at (0,0) in the f(z) plane. If the branch cut has to start at a different coordinate (say, for f(z)=sqrt(z)+1 it would have to start at (1,0) ), then the app mapping does not work correctly. It, nevertheless, gives some insight (in my opinion). Especially for a student! If you wanna develop it further, let me know!!

An example

Consider the function f(z)=sqrt(z^2-4). This function has two branch points, i.e., where the argument of sqrt() is zero. These points are the roots of z^2-4, which are +2 and -2. The square root function maps the full z plane to a half plane, which is why in the app it appears like this:

On the right-half, we see the f(z) plane. The color represents the angle, i.e., arg(f(z)). We see the upper half plane in vivid colors, whereas the lower half is ‘dull’. The upper half is the current branch. The branch cut is defined, interactively, by the white solid line. The other branch cut line is, as discussed, automatically defined in the dashed line by the half-plane mapping the square root function gives. This upper-half plane branch cut, shown in the z plane, looks like two lines spanning out from the branch points (+2 and -2). They appear in the left-hand figure as a discontinuity in the color (which represents the angle of f(z)).

The cool thing is that, by dragging the little black dot in the right hand figure, we can move the branch cut interactively. Here’s a video of what happens if we go around the circle:

And another one going though other random functions:

I hope this helps you as much as it helped me understand these concepts! It’s actually quite cool once you visualize it!

Driving a hundred solenoid valves

As I discussed in this past post about MIDIJets, I was attempting to make a platform for surveying microjet actuator location and parameters in aerodynamic flows for my PhD research. But I think this is something that can be quite useful in many other contexts. After working with this for a couple months now and realizing how robust the driver I developed was (yes, I’m proud!), I decided to release this project as an open-source hardware. Maybe someone else might find this useful?!

With that said, the project files can be found at this GitHub page: . The files should be sufficient for you to both build your own board, program it with a PICKit4 (I’m pretty sure you should be fine with older PICKit versions) and communicate with the Serial port through a USB connection.

What can I do with it?

Now, let’s talk about the device’s uses. Being able to control many solenoids with a single board can be very useful. In my case, the application is aerodynamic research. We can activate or energize a boundary layer of a flow. But maybe the applications could transcend aerodynamic research? Imagine a haptic feedback glove that makes vibrating air jets on your fingers, how cool would that be? Or maybe an object manipulator by controlling where air is issuing from? I think there’s some other possibilities to be explored. If you would like to replicate this, let me know.

Visualizing the jets

Here’s some quick flow-vis showing the pulsating jets with a small phase delay of 60º. Just as a reminder, visualizing jets of 0.4mm diameter is not easy – so I apologize if the video looks noisy! There’s a dust particle floating in the air in some frames. That’s kinda distracting but is not part of the experiment!

Some caveats

Well, I’m a mechanical engineer, so board design is not really something I do professionally. Therefore, expect some issues or general weirdnesses with my design. If you’d like to replicate this, I used a Matrix Pneumatix DCX321.1E3.C224 solenoid. It is not a large valve. The right connector is on the project BOM. The issue is that this valve is a high voltage, low current valve (24V, 50mA). The driver shield I designed has those specs in mind. This means a different driver circuit would probably be needed for valves with different specs. Also, for higher currents, be mindful that the motherboard carries the current through it, possibly generating some noise if the driver current is too high (yes, I was not very smart in the board design!).

In conclusion

Well, I hope you found this mildly interesting. If you think you could use this project and you made something cool inspired by this, I would be pleased to know!