SDOPP – on creating Synthetic Datasets for One Photo Photogrammetry

It must have been over a year ago that I first had this idea.

I was admiring a “speed modelling” video: a time lapse of someone creating a digital 3D sculpture based on a single picture.

As someone who was only just starting to explore the world of ML, I remember naively thinking “Ah! This is a perfect job for AI! If a person can do it, then it must be easy to train a neural network to do the same thing!”

I started thinking about ways to generate a training dataset, and soon I had an answer. I had a way to synthesise as many data pairs as I liked. One piece of data would be a 2D image, and its partner would be a 3D model of the scene in the picture.

A 2D image of a face that has been automatically reconstructed in 3D
Previous work has successfully produced domain-specific 3D reconstructions from pictures, such as faces (above) or furniture and vehicles. Nothing (that I could find) has aimed to create a general solution that is capable of reconstructing previously unseen classes.

The original idea

Conceptually, my idea was simple. I would write a script in Blender 3D that creates random scenes, and renders them.

On the face of it, it doesn’t sound very useful to have a lot of rendered randomness. But alongside the rendered image, my script would also export the 3D model that generated it. This would give you the exact type of labelled data that supervised machine learning algorithms love. Your “X”, the input, would be the rendered image. Your “Y”, or the output, would be the 3D geometry of the picture.

At the time, I had a fairly limited understanding of machine learning. My idea of a neural network was something that magically takes one type of data and transforms it into another.

As such, my plan was rather simplistic, as it just involved creating arbitrary landscapes in Blender, rendering them, and hoping for the best.

After gaining just a small amount of experience in the field, I realised I had to make a few changes if this was to have any chance of succeeding.

Making the idea more feasible

Completing Andrew Ng’s excellent machine learning course gave me more time to reflect on the idea, and the limitations of my existing model.

I realised that a model trained on complete randomness would not generalise to real life objects, because objects in real life are not random. They follow a number of patterns, with subtle relationships between any two parts of an object. For example, a trained 3D artist may be able to model a whole 3D head from a single profile photo, but only because she knows that most faces are roughly symmetrical. Or, she could model a table with a hidden leg, because she knows it is likely to look like the other legs.

This sort of interpolation of unseen data was essential, as I wanted the model to reconstruct the unseen parts of objects. The aim was for this trained model to take a single photograph, taken from any angle, and reconstruct the whole object, including the side facing away from the camera. As a side effect (at least in theory), a model with this sort of understanding would also be able to handle obscured data in photos.

Look around you. What do you see? Chances are that most objects you see have at least one line of symmetry: chairs, people, screens and cars, to name a few. Our inherent understanding of the patterns within objects is key to an artist’s ability to recreate parts she cannot see.

A compromise

So the challenge was to build a dataset that conveys these relationships between parts of an object. A simple solution would be to train on 3D scans of everyday objects. However, available datasets that I could find were quite small. I feared that a model trained on ordinary examples might not generalise to new objects well. The danger of overfitting a dataset like this is also present, because of the likely complexity of a neural network capable of this sort of reconstruction.

I faced a dilemma. On the one hand, I wanted a neural network trained on this dataset to be as robust as possible, which calls for a large and diverse dataset. On the other hand, that diversity could not come from simply generating noise.

I settled on a compromise. My script would take a seed set of 3D models (no corresponding image necessary), and produce permutations of them. For example, it would put these objects into a number of realistically lit scenarios, and render from different angles. It would also add distortions to the objects so that even a single seed object could lead to infinite variations.

Because the ultimate aim is to create a reconstruction model that can generalise to unseen classes, using a seed set of models might seem to defeat the point. However, I believe a framework like this can learn to reconstruct a large number of classes very easily. To train on an additional class will take just a single example 3D model. It is conceivable that a network that is capable of creating 3D reconstructions of a sufficiently diverse set of classes may generalise to new ones, as long as they bear resemblance to a previously seen one.

The script

I’ve created a simple Blender script that aims to do this. As I write this post, it is creating its first batch of training data. Once I have attempted to train a model on it I will be posting the results here on my website.

Before running the script, the user must choose 3 options:

  • The script is capable of simulating outdoor or indoor lighting. As such, the user can choose what percentage of the rendered images should be indoors, and what percentage should be outdoors. This will be useful in sets that are dominated by e.g. cars and houses (all largely outdoors), or furniture and household items (all largely indoors).
  • The second setting is how many pairs of data to produce. If there is no limit, this can be set to 0 to produce data indefinitely. I recommend setting it to 1 for testing, as Blender becomes unresponsive while the script runs.
  • Thirdly, and lastly, the user should point the script to a folder containing the seed dataset in the form of .stl files. The outputs will be in a subfolder called “exports” that is placed within this folder.
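
As a rough illustration, the three options could be held in a small settings object like the one below. This is only a sketch: the names `GeneratorOptions`, `indoor_fraction`, `num_pairs` and `seed_folder` are placeholders chosen for this example and may not match the names in the script itself.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class GeneratorOptions:
    """Hypothetical container for the three user-facing options."""
    indoor_fraction: float  # share of renders lit as indoor scenes, 0.0 to 1.0
    num_pairs: int          # image/model pairs to create; 0 means run indefinitely
    seed_folder: Path       # folder holding the seed .stl files

    def __post_init__(self):
        self.seed_folder = Path(self.seed_folder)
        if not 0.0 <= self.indoor_fraction <= 1.0:
            raise ValueError("indoor_fraction must be between 0 and 1")
        if self.num_pairs < 0:
            raise ValueError("num_pairs must be 0 (no limit) or a positive integer")

    @property
    def export_folder(self) -> Path:
        # Outputs go in an "exports" subfolder inside the seed folder.
        return self.seed_folder / "exports"


# One pair is enough for a quick test, since Blender is unresponsive while the script runs.
options = GeneratorOptions(indoor_fraction=0.3, num_pairs=1, seed_folder="seeds")
```

Keeping the export folder as a derived property means the user only ever has to point the script at one location.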

How the script works: basic steps

Here is a brief outline of how my script works:

  • Firstly, it loads all .stl files found in the selected folder. These are added to a separate layer in Blender.
  • A loop then runs once for each data pair the user wants to create. It will run indefinitely if the user sets it to create “0” data pairs.
  • Each time the loop runs, it carries out these steps:
  1. The script randomly picks an object and duplicates it to the main Blender layer (layer 1).
  2. It decides whether to simulate an indoors or an outdoors scene, depending on the probability assigned by the user. The script then simulates the scene.
  3. The script creates the camera in a random location, though it always points at the subject. Focal length is chosen intelligently so the subject always takes up a substantial portion of the view, regardless of the camera distance.
  4. A simple distortion is applied to the object to add variation. At the moment, this is rudimentary: a random subset of vertices is scaled and translated.
  5. The resulting object is rendered. The resulting image is saved.
  6. The final, distorted .stl is also exported to the same folder with the same file name.
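
The steps above can be sketched in plain Python, without Blender, to show the logic of the loop. This is a simplified illustration, not the script itself: every function name, the default values, and the pinhole-camera formula for the focal length are assumptions made for the example. The rendering and .stl export, which Blender does in the real script, are left out, so each iteration just yields a description of the pair it would produce.

```python
import math
import random


def choose_scene(indoor_fraction, rng):
    # Step 2: pick indoor or outdoor lighting with the user's probability.
    return "indoor" if rng.random() < indoor_fraction else "outdoor"


def random_camera_position(distance, rng):
    # Step 3: a uniformly random point on a sphere around the subject (at the origin).
    z = rng.uniform(-1.0, 1.0)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (distance * r * math.cos(angle), distance * r * math.sin(angle), distance * z)


def focal_length_mm(distance, subject_size, fill=0.7, sensor_width_mm=36.0):
    # Assumed pinhole model: image width = focal * size / distance. Solving for
    # the focal length makes the subject span `fill` of the sensor at any distance.
    return fill * sensor_width_mm * distance / subject_size


def distort(vertices, rng, fraction=0.2, max_scale=0.2, max_shift=0.1):
    # Step 4: scale and translate a random subset of vertices, using one shared
    # random scale factor and one shared random offset.
    count = min(len(vertices), max(1, int(len(vertices) * fraction)))
    chosen = set(rng.sample(range(len(vertices)), count))
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    return [
        tuple(c * scale + s for c, s in zip(v, shift)) if i in chosen else v
        for i, v in enumerate(vertices)
    ]


def generate_pairs(seed_meshes, num_pairs, indoor_fraction, seed=None):
    # seed_meshes is a list of (name, vertices) tuples; num_pairs == 0 runs forever.
    rng = random.Random(seed)
    produced = 0
    while num_pairs == 0 or produced < num_pairs:
        name, vertices = rng.choice(seed_meshes)            # step 1
        scene = choose_scene(indoor_fraction, rng)          # step 2
        distance = rng.uniform(2.0, 10.0)
        camera = random_camera_position(distance, rng)      # step 3
        size = 2.0 * max(math.dist(v, (0, 0, 0)) for v in vertices)
        focal = focal_length_mm(distance, size)
        mesh = distort(vertices, rng)                       # step 4
        stem = f"{name}_{produced:06d}"                     # steps 5 and 6 share a file name
        yield {"image": stem + ".png", "model": stem + ".stl", "scene": scene,
               "camera": camera, "focal_mm": focal, "vertices": mesh}
        produced += 1
```

For example, `list(generate_pairs([("Cube", cube_vertices)], num_pairs=3, indoor_fraction=0.5))` would describe three pairs, where `cube_vertices` is a list of (x, y, z) tuples. Writing it as a generator keeps the "0 means indefinitely" behaviour simple, because the caller decides when to stop consuming.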


Known bugs:

  • There is currently a bug which means that when the name of an .stl file starts with a letter, it must be a capital letter.

Possible limitations:

  • To introduce more variation, the current form of the script randomly rotates objects about all axes. It would perhaps be more realistic to only rotate them about the z axis. After all, a chair can be facing left or right, but it is rare to see one upside down.
  • It is heavily dependent on the quality of the dataset (like all projects involving synthetic datasets).
  • It is likely to be a high variance problem, requiring training on vast datasets.
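
The first limitation could be addressed by restricting the random rotation to the vertical axis. Below is a minimal sketch of that idea in plain Python, assuming z is "up" (as it is in Blender) and that a mesh is just a list of (x, y, z) tuples. The function names are made up for this example.

```python
import math
import random


def rotate_about_z(vertices, angle):
    # Rotate every (x, y, z) vertex by `angle` radians about the vertical z axis.
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in vertices]


def random_upright_rotation(vertices, rng=random):
    # Random heading, but "up" stays up: only the rotation about z is randomised.
    return rotate_about_z(vertices, rng.uniform(0.0, 2.0 * math.pi))
```

With this change a chair could still face any direction, but it would never be rendered upside down.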

I considered alternative ways to create this type of dataset:

  • A 3D scanner could create many scans of everyday objects, with a corresponding picture taken of each. Though this data would be high quality, and representative of real life objects, it would be prohibitively time consuming to 3D scan large numbers of objects.


I can think of a few ways that this can be useful.

  1. Aiding 3D artists. This is one of the most obvious ones. It can be used in any situation that photogrammetry can, but with a more streamlined pipeline. There are also situations for which photogrammetry is not suitable; for example, photogrammetry does not deal well with transparent or reflective subjects. The method I have described can easily create training examples that contain reflective or transparent objects. A neural network trained successfully on such a dataset should be more robust to the material of the object.
  2. Improve robustness of neural networks to new viewpoints. One disadvantage of convolutional neural networks is their sensitivity to the viewpoint of the training examples. For example, if someone trained a cat or dog classifier on photos that are all head on, it might not recognise a profile shot of a cat or a dog. A model with spatial understanding may be able to learn classifications that are more robust to pictures taken from angles not seen in the training set. Here is one way to use this sort of dataset for that purpose. It would involve two neural nets – a “reconstructor” and a “classifier”. The “reconstructor” is a neural net trained on the dataset above. In other words, it would take 2D images as inputs and learn to reconstruct their 3D structure as the output. This 3D structure is then fed to the “classifier”, which takes the output of the first neural net as its input instead of images. In other words, it learns to classify the 3D structure of objects rather than 2D images. In principle, this architecture would be invariant to the point of view from which an image was taken.
  3. Aiding scientists. I have a paper in the works that compares the accuracy of traditional photogrammetry with gold standard CT scanning. Though it’s in early stages, our initial findings are promising. For many models, the median error between our photogrammetry model and the ground truth is smaller than the resolution of the CT scanner. I would love to see how a model like this compares.
  4. Guard against adversarial examples. This one is purposefully last on the list, because I think it’s the least likely to hold true. I still thought it was worth sharing, because of the great difficulty, and potential dangers, that adversarial examples pose. Essentially, my thinking was that with a well designed dataset generator, and infinite computing power, you could train on hundreds of millions of training examples and far surpass the abilities of any 3D modelling artist. It may be reasonable to think that this neural network would be more robust to adversarial examples, because of the ability to train it on arbitrarily large datasets and the ease of introducing noise into this dataset. The reason I doubt it now is that the second neural network described in point 2 would not enjoy these same benefits. As such, an adversarial example would simply have to create a 3D structure that “tricks” the classifier network.

A note on licensing

I have created a script for Blender 3D that creates datasets, as described above. It is currently very rudimentary, and produces training examples that do not subjectively look like real photos. That said, if anyone would like to have a look at the code I am perfectly happy to open source it. Let me know in a comment below.

“Holistic Theme” – My Custom WordPress Theme for Clinics

I am responsible for the website of a local clinic, where I work part time. For about two years now, this has involved regularly maintaining a website which another agency designed. However, the recent acquisition of a new clinic premises has meant the need for a second, new, website.

Their existing website was so old that mobile wasn’t a concern on the designers’ minds. As such, a clumsy workaround meant that the experience on phones and tablets was far from ideal.

The Job

We saw this new acquisition as a great opportunity to kill two birds with one stone.

The clinic needed a new website, and the owners weren’t happy with the previous agency’s work. I volunteered to create a new custom theme, and the idea was to use this theme across both websites.

The two clinics provide a very complementary healthcare service, and so I found it only fitting to call this theme “Holistic Theme”.

Holistic Theme

Because of the previous issues with mobile responsiveness, I designed the new website from the ground up to work on mobile. This included a mobile first approach, as recommended by most modern guidelines.

After all, this would be the first website that I would design. As such, I didn’t want to pick up bad habits from the very beginning!

The theme would be easy to navigate on any device, and make it as easy as possible to get in touch or visit us. I therefore wanted to highlight the phone number (which was click-to-call on mobile) and our location, with prominent visibility beside our logo.

Our new website

The new clinic that was being acquired is called Alternatives Clinic. Check out the screenshot below, or visit it to get a feel for the responsiveness.

Reusing the template

Modifying the template to allow reuse took more work than I imagined (but to be fair, so do most things).

Some aspects, like the website title, were easy to change between sites. WordPress collects this information, which can be extracted easily with PHP.

Other aspects were much harder. For example, I wanted the same header design but different colours. The headers also needed to show different phone numbers, for the different lines.

After I found a solution (an excellent WordPress plugin called Pods) I managed to easily transfer the same theme for use on the old website.

Updating the old website

The website is Anana Clinic, and as you can see below, it used the same header design. Despite that, it had some important differences that would not have been possible without Pods. For example, the colours, phone number and some font sizes were all different. In addition, this website does not have a logo so I found a way to only show the logo for the first website and not this one.

Track Visitor Scrolling in Google Analytics (Step by Step Beginner’s Guide)

As a newbie to Google Analytics, I’ve been struggling for the last week to implement scroll tracking on my website.

I messed around with lots of plugins, with varying levels of unsuccess.

Some were hard to understand, while others had dubious or nonexistent documentation.

It was a pain.

But eventually, I found a script that did exactly what I was looking for. It tracked key scroll points, at 25%, 75% and 100%, and was relatively easy to integrate with Google Analytics. The only issue was the out of date how-to guide, which meant some experimentation before I figured out exactly what I should be doing. It used some Google tracking features that don’t exist any more, and some others that have been completely renamed. I’ve created this guide for anyone looking to implement this from late 2017 onwards, and I’ve also mentioned every step to make it an absolute breeze to follow.

Before we start, let’s take care of some common questions:

Will I Have to Edit My Website’s Code?

If you have already installed Google Tag Manager, you won’t need to make any changes to your website’s code. Even if you haven’t, and need to add GTM, this can be done very easily.

If you’re running a WordPress site, you’ll just have to install this plugin. (Tip: the “experimental” code injection method performs great and I’ve never run into problems with it). If not, I recommend the official Google guide.

What Will I Need Before I Start?

There is one prerequisite for this guide:

This tutorial was designed to be as easy as possible to implement, so nothing else is required. I will go through the process step by step, so no prior Analytics or Tag Manager experience is required.

If any part of it is difficult to follow at all, just let me know below and I’ll do my best to fix it.

Hold on a Second. What even is Google Tag Manager?

Google Tag Manager (GTM) is a tool that simplifies modification of your website’s code. It is a single piece of code that you add to your website, and it will greatly reduce the number of changes you’ll need to make to your site’s code.

It does this by injecting pieces of code dynamically into your website, and you can control what it injects easily with an online interface. What I find particularly useful is that one change in Tag Manager will propagate to both my live site, and any local installations I have on my computer. No more need for changing the local installation, before spending an age to backup everything before I can transfer changes over to the live version of the site.

How Long Will it Take?

This should take 5 minutes if all goes smoothly. If it doesn’t, then God help us all.

I’m kidding. It’s easy 😊

Alright, Let’s Get Started

Step 1: Add the Tracking Code

First, we will add a piece of tracking code to our site. This can be done with Google Tag Manager, requiring no modification to our code.

OK first, open up GTM and select the website you want to edit:

Now on the left hand side, click “tags” then hit the big red “NEW” button. You can see the button below:

This is what you should see:

At the top, click on “Untitled Tag” and name it something reasonable like “Scroll Tracking Code”.

Next, click anywhere on “Tag Configuration”. Scroll down and select “custom HTML” as the tag type.

You’ll notice a text box has opened up. Now copy all of the following code and paste it inside.

Now we need to make sure this tag fires when it’s ready. Underneath “Tag Configuration” you’ll see “Triggering”. If you click triggering, it will open up a new page. At the top right corner, you should see an arrow. Click the arrow in the top right:

Now click “Trigger Configuration” and select “DOM Ready”:

All you should do now is change “Untitled Trigger” to “DOM Ready”. It should look like this now:

You can now save your trigger and save your tag (top right corner).

Great, now our code is set up. All we have to do now is make sure it communicates with Google Analytics.

Step 2: Hook up the Tracking Code to Google Analytics

Since we’ll be integrating with Google Analytics, make sure you know your Google Analytics tracking number before you start this step.

Got it? Make sure you’re back at the “tags” page in GTM. Click “NEW” again to make a new tag. This new tag is all we need to integrate the scroll information with Google Analytics.

Like before, name it something useful like “Scrolling Analytics Integration”. Now hit “Tag Configuration” and select “Universal Analytics”. (If you are using Classic analytics this should still work the same, I just haven’t tested it personally).

Under “Track Type”, select “Event”:

You’ll see four text boxes open up, with the words Category, Action, Label and Value. These are all things that are passed to Analytics, so we’ll need to pull the scrolling information and put it into these variables. Luckily, this is quite easy. Let’s start with the first one: “Category”.

Setting up the “Category”

Click the little lego shaped icon next to the “Category” box. In the top right, you’ll see a plus sign. Click it:

Now click “Choose a variable type to begin setup…”. Scroll until you see “Data Layer Variable” and select it. You should find this under “Page Variables”.

At the top, name it something reasonable instead of “Untitled Variable”. I called it “Category Data Layer Variable”.

Simply change the “Data Layer Variable Name” to eventCategory. Change “Data Layer Version” to “Version 1” if it isn’t already.

This is what the complete variable will look like:
Save everything. When you finish, you should see this:

Now we will go through and fill in the other three boxes.

Setting up the “Action”

For “Action”, we want to track the Page URL it was triggered on. This will let us compare the performance of different pages, to see which content is being read.

This set up is simpler. Again, we will click the lego shaped icon and look for “Page URL”. If it is present, select it. If it isn’t there, click “BUILT-INS” in the top right corner, and check the “Page URL” box. This should now show, and you will be able to select it:

This is what everything should look like so far:

The next two boxes, “Label” and “Value”, will be set up very similarly to the first box, “Category”.

Setting up the “Label”

Like before, hit the lego icon, select the plus in the top right hand corner. Click “Choose a variable type to begin setup…” and select “Data Layer Variable”.

Like before, change “Untitled Variable” to something reasonable. I called it “Label Data Layer Variable”.

Simply change the “Data Layer Variable Name” to eventLabel. Change “Data Layer Version” to “Version 1” if it isn’t already.

Save everything. Progress picture so far:

Setting up the “Value”

Now for the last box, “Value”. We will set this up very similarly to the last one.

Click the lego icon and hit the plus in the top right hand corner. Click “Choose a variable type to begin setup…” and select “Data Layer Variable”.

Like before, change “Untitled Variable” to something reasonable like “Value Data Layer Variable”.

Now just change the “Data Layer Variable Name” to eventValue. Change “Data Layer Version” to “Version 1” if it isn’t already.

You don’t want scrolling to affect your bounce rate, so make sure “Non-interaction hit” is set to TRUE. You can see what this will look like, below.

The last setting in this Google Analytics tag is your Analytics ID. Under “Google Analytics Settings” select “New Variable” and input your Google Analytics Tracking ID. Name this something reasonable and save.

This is what it should all look like so far:

The only thing left is to trigger this code whenever someone scrolls. For this we need to create a trigger. Click “Choose a trigger to make this tag fire…” below everything we’ve just created. You should find it under “Triggers”.

Click it, and hit the plus arrow in the top right corner to create a “New Trigger”. At the top, name it “Custom Scroll Event” then hit “Choose a trigger type to begin setup…”. The type you want is “Custom Event”, right at the bottom under “Other”. You will see just one text box appear with the heading “Event Name”. Type in “scrollDistance”. The final trigger should look like this:

That’s it, save everything, and in your Google Tag Manager make sure you “Submit” your changes. They won’t take effect until you do this! You’ll have to write in some details about what changes you made.

Bonus Step: Make a Scroll “Goal”

If you’ve followed so far, your code should be triggering Google Analytics “events” whenever someone scrolls. This information by itself can be useful, because it shows how far people are scrolling on average. For example, look at the statistics from one website where I have implemented this code:

But we can make this even more useful and actionable. By triggering a Google Analytics “Goal” for key scroll levels we can do some really powerful things. I have used this trick with Google Optimize (an A/B testing tool) to create goals for improving my headlines. Essentially, I am testing multiple headline variations for my pages to see which ones lead to scrolls. In other words, I am testing to find the headlines that lead people to read my content.

Doing this is relatively straightforward.

Just go to your admin panel in Google Analytics. It looks like this:

Now click “Goals” and hit “+ NEW GOAL”. For step 1 of the setup, select “custom”. Click continue.

For step “2”, name it something reasonable and select the “Event” type. Click continue.

For the third and final stage, set Category equal to “Scroll Depth” and Label equal to the percentage that you want to track. I have set it to 100%, because I want to track how many people read the whole page:


You Can Now Track Scrolling in Google Analytics!

And that’s it! Thanks for reading, don’t hesitate to leave a comment down below if you have any questions at all.


In the world of Blender 3D, I go by Moby Motion

I’ve been making 3D animations for many years now, and I have a website for my Blender animations.

It’s the Moby Motion Blog. You can also see my animation work on tumblr and twitter.

That said, I will also be posting animation related content here if it doesn’t belong on my animation blog. For example, I have a lot to say about YouTube as a platform. I look forward to sharing my thoughts about what it takes to succeed on the platform.

What I plan to share

My 10,000,000 views on YouTube aren’t impressive in the grand scheme of things. However, to gain so many views in such a niche area has taken a lot of work. I’m now the largest channel in the physics simulation niche. Additionally, I’ve grown my channel to where it is now while studying medicine, and this has taken a huge conscious effort on my part to use my time effectively.

On this website, I plan to share the advice that has helped me grow my channel to where it is now. This will include advice on how to get YouTube views, how to turn those views into subscribers, how to monetise your YouTube audience, and more.

Look out for posts tagged YouTube, for more on this topic.