So, enough talking. What is it? What can it do? Well, let me just show you.

WHAT IT DOES

This network can "morph" between any two images you give it. When you give it the right kinds of images, it does a pretty good job. But first, let me give you the backstory.

I found the excellent ConvNetJS library, made by Andrej Karpathy. It's great if you want to train AI in JavaScript. He made multiple demos showcasing the library's capabilities, but the one I liked the most was Image Painter. Image Painter is pretty simple: you give the network an image, and it tries to reproduce it. Here's the architecture for the network: (again, sorry for the sloppiness)

The network takes the x and y coordinates (the position) of each pixel in the image, and outputs the R, G, B values (how computers see color) for that pixel. It's pretty satisfying to watch the network try to reproduce the image. Some of you might be thinking, "What's the point of recreating an image you already have?" And you're right: it's a pretty pointless task. But if you're really nerdy like me, you find these kinds of things entertaining.

Then I got thinking: what if you trained a network on two images at once? Could you interpolate between them? And that's how Morphnet was born. The key is adding a third input value to the network. This input represents the image (1 for the first image, 2 for the second. I could have started at zero, but I didn't. Deal with it.)

So, how do we morph the two images? Well, if 1 represents the first image, and 2 represents the second, then 1.5 should be somewhere in between the two. 1.1 should be 90% of the first one and 10% of the second one. We don't tell the network how to morph between the two (actually, I did a little, as you'll see later); it decides how to morph them on its own. (I'll show a rough code sketch of this setup at the end of this section.)

The first version of Morphnet was done pretty hastily (like this blog post). I literally had to use the JavaScript console to operate it. Later, I added a UI (albeit not a fancy one) and cleaned up some code. Let's go through all the changes I made to the network.

THE DEVELOPMENT

Here's the first image morph I did with the network:

One thing you'll notice: it's very low resolution. The first change I made was bumping the image resolution up to something more acceptable. After that, it started to produce somewhat decent results. (Though, due to the low-quality nature of GIFs, you might not be able to see it that well.)

But there was a problem: the neural network didn't replicate the images exactly, even if it got really close. This didn't matter for the interpolated parts, since those were the parts the network generated by itself. With the beginning and end, though, you could definitely tell something was off:

My first attempt to solve this problem was to put the original images at the start and end of each morph. This kinda worked, but I could do better. To really solve it, I faded the original images in and out at the ends, like this:

You might notice the two morphs look different. That's because the first one was done using an earlier version of the software. Once that was taken care of, I was pretty happy with the results. For a while, I left the network as it was and started using it a lot. I did tweak a few things here and there, but for the most part, it didn't change. Here's another morph, because, why not. Also, note how it fades in and out at the ends:

Pretty quickly, I learned what worked well and what didn't.
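First, though, here's the rough code sketch I promised. It shows the core idea in ConvNetJS; the layer sizes and the helper name are illustrative, not the exact ones Morphnet uses:

```javascript
// The Morphnet idea in ConvNetJS (illustrative sizes, not the real settings).
// Assumes the ConvNetJS script is loaded (window.convnetjs).
var layer_defs = [];
// Three inputs: x, y (pixel position) and t (which image: 1 or 2).
layer_defs.push({type: 'input', out_sx: 1, out_sy: 1, out_depth: 3});
layer_defs.push({type: 'fc', num_neurons: 20, activation: 'relu'});
layer_defs.push({type: 'fc', num_neurons: 20, activation: 'relu'});
layer_defs.push({type: 'fc', num_neurons: 20, activation: 'relu'});
// Three outputs: the R, G, B values for that pixel.
layer_defs.push({type: 'regression', num_neurons: 3});

var net = new convnetjs.Net();
net.makeLayers(layer_defs);

// Once trained, you get in-between frames just by feeding an in-between t.
function colorAt(x, y, t) {
  var out = net.forward(new convnetjs.Vol([x, y, t])); // a 1x1x3 input volume
  return [out.w[0], out.w[1], out.w[2]];               // r, g, b
}
// colorAt(x, y, 1.5) is the halfway morph; 1.1 is 90% image one, 10% image two.
```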
Here are some guidelines for images that the network likes:
Here's another, to illustrate the point:

However, there are some images that the network doesn't like. For example, try these two images:

How do you think the network will morph these? My ideal method would be to start off with a red shirt, transform into an orange shirt, and end as a yellow shirt. The network, however, does it differently: it goes straight from red to yellow. After morphing images (and I've morphed a lot of images), I've concluded that my neural network really doesn't like to blend colors.

So, I wrote a function to help it out. The function pretty much takes a pixel from the first image and the corresponding pixel from the second image, blends them together, and trains the network to output the blended color. (There's a sketch of this function, along with the other tweaks from this section, at the end of the section.) Remember earlier when I said I was "cheating" a little? This is exactly what I was talking about. It doesn't affect the network too much, because I use color blending sparingly; most of the time, the network interpolates the images by itself. And even with blending on, it's still terrible at blending colors. Here are the shirts again, with color blending on:

You can't really tell the difference. By this time, I had the network running on my phone, using Safari. Overall, I was satisfied with the results I was getting. However, I had yet to implement one of the most crucial changes to the network.

THE LAST FEW CHANGES

In order to understand the change I made to the network, we first need to understand how coordinate systems work. Each point on a 2D plane has an X and Y value representing its position on the plane. Every application that deals with images (which is pretty much all of them nowadays) uses a coordinate system, and almost every coordinate system on a computer maps (X:0, Y:0) to the top left of the plane, like this:

My network also worked like this for a while. It worked, but I noticed something odd. Take these two images (from spoonflower.com):

One would assume the two images would be equally easy for the network to reproduce. But that's not the case. Here are the two attempts:

As you can see, the network has an easier time with the one on the left than the one on the right. Why? I don't know the exact reason, but I suspect the coordinate system has something to do with it. So, I changed the system: instead of the origin sitting at the top left of the screen, it now sits in the center. As a result, I've noticed an improvement in the quality of the morphing. Which means it's time for another GIF: (but I've seen the network do better)
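Here's that sketch. These helpers are illustrative (the names and exact numbers are mine, not Morphnet's real code), but they capture the three tweaks: color blending, centered coordinates, and the fade at the ends:

```javascript
// Color blending: mix corresponding pixels from the two images, so the
// network can be trained toward the blended color at in-between t values.
// rgb1/rgb2 are [r, g, b] arrays; t runs from 1 (image one) to 2 (image two).
function blend(rgb1, rgb2, t) {
  var a = t - 1; // 0 at image one, 1 at image two
  return [
    (1 - a) * rgb1[0] + a * rgb2[0],
    (1 - a) * rgb1[1] + a * rgb2[1],
    (1 - a) * rgb1[2] + a * rgb2[2]
  ];
}

// Centered coordinates: put (0,0) in the middle instead of the top left.
function toCentered(px, py, width, height) {
  return [px - width / 2, py - height / 2];
}

// Fading: near t=1 and t=2, mix the network's output with the original image
// so the first and last frames match the real pixels exactly.
// Returns 1 at the endpoints, falling to 0 a quarter of the way in
// (the 4 controls how fast it fades).
function fadeWeight(t) {
  return Math.max(0, 1 - 4 * Math.min(t - 1, 2 - t));
}
// Displayed pixel: w * original + (1 - w) * network output, with w = fadeWeight(t).
```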
In the days leading up to Procjam 2017, I made some improvements to the UI. I tried experimenting with morphing more than one image at a time, but was unimpressed with the results. So, that's pretty much the development of the network.

I can imagine many of you asking: wait, what's Procjam? Procjam is an annual jam hosted by Cut Current games. The idea is to "make something that makes something." Considering my network makes an interpolation of two images, I think it qualifies. Thus, I will be submitting this to the jam.

HOW TO USE IT

If you made it this far into the blog post, then I can only assume you want to try this thing out for yourself. Luckily, Morphnet can be found here. I will warn you: the UI isn't very beginner-friendly (or pretty). Once you get the hang of it, though, it's not too bad. So, here's how to use it.

STEP ONE: CHOOSING IMAGES

Near the top, there should be two buttons that say (Choose file) on them. Click one, and it should prompt you to upload an image. Select your image, and you should see it appear at the top. Do the same for the other button.

STEP TWO: TRAINING

Hit the checkbox that says "train" on it, and it will begin training. If you want the neural net to periodically show you its progress, hit the checkbox labeled "update" (though this can slow down training). You can train the network for as long or as little as you'd like; I recommend 5-8 minutes. Training for longer will make the network produce higher quality morphs.
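If you're wondering what actually happens while that "train" box is checked, each training tick boils down to something like this. This is a simplified sketch: `img1Data`/`img2Data` stand in for the two images' canvas pixel data, `net` is the network from the earlier sketch, and the trainer settings are illustrative:

```javascript
// One simplified training step (illustrative names and settings).
var trainer = new convnetjs.SGDTrainer(net, {
  learning_rate: 0.01, momentum: 0.9, batch_size: 5, l2_decay: 0.0
});

function trainStep(img1Data, img2Data, width, height) {
  // Pick one of the two images at random (t = 1 or t = 2).
  var t = Math.random() < 0.5 ? 1 : 2;
  var data = (t === 1) ? img1Data : img2Data;

  // Pick a random pixel from it.
  var px = Math.floor(Math.random() * width);
  var py = Math.floor(Math.random() * height);
  var i = (py * width + px) * 4; // canvas ImageData is a flat RGBA array

  // The target is that pixel's true color, scaled to 0..1.
  var target = [data[i] / 255, data[i + 1] / 255, data[i + 2] / 255];

  // The input is the pixel's (centered, scaled) position plus which image it's from.
  var input = new convnetjs.Vol([px / width - 0.5, py / height - 0.5, t]);
  trainer.train(input, target); // nudge the network toward the true color
}
```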
STEP THREE: THE RESULTS

Once you feel like the network has trained for long enough, deselect the "train" checkbox. Now, you'll need to focus on these sets of buttons:

In case you missed it, I'll put another link to the software here. Now that you've got the basics down, there are a few other things you should know about if you really want to use Morphnet like a pro.

OFFSETTING

Earlier, I talked about fading the original images in and out at the ends; that's what I call offsetting. In order to offset your own images, you'll need to make a few modifications to the steps:
You'll notice the original images have been faded in and out of the animation. This should look a little nicer than before.

RESETTING THE NETWORK

To reset, you could always refresh the page. But there's a way to reset that doesn't involve refreshing:

1. Click (Reset) to make the network forget everything it learned.
2. Choose new images by clicking the corresponding (Choose input) buttons.
3. Train it like normal.

Now you know pretty much everything you need to know.

CONCLUSION

There have been many attempts to put neural networks on the web. Arguably the most famous example is Google's Quick, Draw! experiment, which tries to guess players' doodles. While things like that are cool, I think they don't show much of what's going on behind the scenes, and they don't give the user much flexibility.

Speaking of showing you what's behind the scenes, this is normally where I'd post a link to the source code. Unfortunately for anyone who actually wanted to see the code, my parents have decided against letting me have a GitHub account. That still doesn't mean people can't learn from this project, and have some fun. Ultimately, I hope this inspires some of you to learn more about artificial intelligence. And, in case you missed the first two links to my software, here's another one. Bye.