john | July 25, 2021, 6:22 a.m.
The hacker ethos states that the sharing of information and data with others is an ethical imperative. I think many forget this. I mean, I did. And I read Hackers by Steven Levy when I was just getting into coding.
It would be hard for me to deny that the narrative of that book had an effect on me. It did. At the time, it was a pretty decent picture of everything I wanted coding to be. The thing is that life is a bit more real than a book.
But its spirit is not. Hence why, in support of the open source initiative, I decided to make a project. It came from an email exchange I was having with a reporter.
I didn't want to comment on something, but felt sending a picture would be appropriate. It was; they found it amusing and used it. But in making that photo, I said, "I hope they don't get my location from it" since the background had a window, some stucco, and a reflection of a tree.
Clearly enough to give away my location. I sent it anyway, but that got me thinking. How easy is it to remove the background from an image?
Well, of course, the first thing I did was search for this. I saw there were a few people who had solved this problem, and they charged what I thought were outrageous prices.
Oddly enough, there really weren't that many people doing it either, although from the looks of it, it did seem like something a lot of people used.
So, as usual with most of my dumb ideas, I found a reason to make my own. Of course, the people who do this already had a few years' head start on me. And well, I'm slow.
Anyway, I went ahead and bought two books on the subject, picked up a linear algebra book, and decided it was time to start.
In the meantime, as those books were being delivered, I decided it was time to actually code something. So I did what any sane person would do, and searched again to find out if anyone had spent some time solving this problem.
And lo and behold, someone had: I found u2net, a model for picking objects out of an image. This was cool, and there were a few people on GitHub who had posted some code applying this model.
Nothing was particularly well documented, or fast, but the model existed. So I decided to start messing around with it.
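For the curious, here is a minimal sketch of what "removing the background" can look like once a model has produced a mask. This is not the project's code. It assumes the model hands back a per-pixel foreground score between 0 and 1, and the function name `cut_out` and its `threshold` parameter are made up for illustration.

```python
from typing import Optional

import numpy as np


def cut_out(image_rgb: np.ndarray, mask: np.ndarray,
            threshold: Optional[float] = None) -> np.ndarray:
    """Return an RGBA image whose alpha channel is taken from the mask.

    image_rgb: uint8 array of shape (H, W, 3).
    mask: float array of shape (H, W), 1.0 = foreground, 0.0 = background.
          (Assumption: this is what the segmentation model outputs.)
    threshold: if given, pixels scoring at or above it become fully opaque
               and everything else fully transparent.
    """
    if image_rgb.shape[:2] != mask.shape:
        raise ValueError("image and mask must have the same height and width")

    alpha = np.clip(mask, 0.0, 1.0)
    if threshold is not None:
        # Hard cut: no soft edges, every pixel is either kept or dropped.
        alpha = (alpha >= threshold).astype(np.float64)

    alpha_u8 = np.round(alpha * 255).astype(np.uint8)
    # Stack the alpha channel behind R, G, B to get shape (H, W, 4).
    return np.dstack([image_rgb, alpha_u8])


# Tiny made-up example: a 2x2 grey image and a hand-written mask.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[1.0, 0.0],
                 [0.6, 0.2]])
rgba_hard = cut_out(image, mask, threshold=0.5)
rgba_soft = cut_out(image, mask)
```

Saving the result as a PNG keeps the transparency, so the window, the stucco, and the tree simply are not there anymore.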
After some time and a bit more digging, I was able to get a decent project together.
By this point, I had just made a simple script, connected it to a simple web app, and launched it on HackerNews.
To my surprise, it got some traction and ended up on the front page. In the comments, someone asked if I was going to open source it.
Now, I've launched a few dumb things here and there, but this one was hard. And it was dear to my heart, basically because it was difficult to achieve. So I said perhaps.
Then, as I realized a few more people liked this than I thought, I wondered if I could make it a bit cooler. And I thought if removing the background from an image was cool, removing the background from a video would be even cooler.
So as I was going piece by piece, wondering how the heck I would get this tool to do that, I ran into some questions that would be pretty easy for an FFmpeg expert. As much as I have used it, I know I'm far from one of the best. So I basically sacrificed some of my StackOverflow points to get one of them to answer. They did.
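The exact FFmpeg questions are not recorded here, but the usual shape of the problem is: split a video into frames, process each frame, then stitch the frames back together. A rough sketch of the two FFmpeg calls, built as argument lists in Python, might look like the following. The file names, the `%05d.png` naming pattern, and the helper names are all assumptions for illustration.

```python
def extract_frames_cmd(video: str, frames_dir: str) -> list:
    """Build an ffmpeg command that writes every frame as a numbered PNG."""
    return ["ffmpeg", "-i", video, f"{frames_dir}/%05d.png"]


def assemble_video_cmd(frames_dir: str, fps: int, output: str) -> list:
    """Build an ffmpeg command that turns numbered PNGs back into a video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),           # rate at which the PNGs are read
        "-i", f"{frames_dir}/%05d.png",   # same pattern used for extraction
        "-c:v", "libx264",                # encode as H.264
        "-pix_fmt", "yuv420p",            # widely playable pixel format
        output,
    ]


split_cmd = extract_frames_cmd("input.mp4", "frames")
join_cmd = assemble_video_cmd("frames", 30, "output.mp4")
```

Each list can be handed to `subprocess.run(cmd, check=True)`. Note that `yuv420p` has no alpha channel, so in this sketch the removed background would have to be filled with a solid color before stitching.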
By the time the first book arrived, the one on building machine learning from scratch, I already had a nice working version of my video remover that was basically going at one frame per second. Which, although amazing, was still awful.
I then stumbled upon someone else's work, and their method was able to reach 1.2 frames per second. A 20% improvement. This seemed promising.
So I took that and started implementing it into what I had working. By this point, it was starting to look like a nice project. I was getting excited.
So then came the moment where I had the working script, the laptop running full blast, and me feeling the heat as I typed on the keyboard while it removed the background from a video, and I had to decide: is this going to be open source?
Of course the answer was yes, since I have no actual utility for this, but I feel other people might. But as I was getting ready to launch it, it irked me that it took so long to do a video.
So as I was building the web app for it by adding in the video, I had to break up the command into a queue with different servers due to the GPU usage. And when I did this, by happenstance, I ran into an issue with PyTorch and multiprocessing.
I'm unsure if it was because of the version of Ubuntu or Python, but in spending some time debugging and fixing that, I realized I was importing the libraries wrong for multiprocessing, and fixed that as well.
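The exact bug is not spelled out above, so the following is only a sketch of one commonly recommended pattern, not the fix that was actually made. It assumes the frames are split into chunks, one per worker, that workers are started with the "spawn" method (the CUDA runtime does not support the default "fork" start method), and that heavy libraries are imported inside the worker. The function names and the placeholder work are hypothetical.

```python
import multiprocessing as mp


def split_into_chunks(items: list, n_chunks: int) -> list:
    """Split items into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(frame_paths: list) -> list:
    """Hypothetical worker: stands in for running the model on each frame."""
    # Assumption: a real worker would import torch and load the model here,
    # inside the child process, rather than at module import time.
    return [f"{path}.done" for path in frame_paths]


def run(frame_paths: list, workers: int = 2) -> list:
    """Process all frames across several worker processes."""
    ctx = mp.get_context("spawn")  # fresh interpreter per worker, no fork
    with ctx.Pool(workers) as pool:
        results = pool.map(process_chunk, split_into_chunks(frame_paths, workers))
    # Flatten the per-chunk results back into one list, in frame order.
    return [item for chunk in results for item in chunk]
```

With "spawn", `run(...)` should be called from under an `if __name__ == "__main__":` guard in a script, since each worker re-imports the main module.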
In doing that, I got the frames per second up to a blazing 14. I was stoked. So then I decided to launch it on HackerNews, ProductHunt, and a few Reddit communities.
Although I must say to my surprise neither HackerNews nor ProductHunt liked the project this time, the Reddit communities all went nuts. It hit the top of r/web_design and r/sideproject.
And if you like it, share the site, and fork and star the repo. If it doesn't work, submit an issue, or submit a pull request.