Category: software

Installing Vagrant on Windows 10

Published on: 29.12.2019

I am a fan of CLI package managers (because they save me time).
On Windows, Chocolatey is the only option I know of, so I tried to use it to install Vagrant with VirtualBox as the hypervisor.

How difficult can it be?
More than I expected.

Installing Vagrant and VirtualBox looks easy:

Init and up:

Here we have the first problem:

After some googling, one possible solution is to set the provider explicitly. It is not a big problem, although I like it when tech just works.

Try again:

Then the next problem:

Trying my luck with the older version 6.0 of VirtualBox:

Let's try again:

Finally working.

As you can see, plain vagrant up now works without the need to set the provider.

Mock time in Python unit tests

Published on: 27.11.2019

From my point of view, unit tests are useful, especially for refactoring functions that have clearly defined inputs and outputs.

One thing that is nearly impossible to test directly is any code that depends on the current time or date, because every time you run it, it is a different time 🙂.

For example, if you have a function that calculates the number of days between today and some other date, the only way to test it is to somehow mock the current time/date.

In Python, there is FreezeGun for that.

It is implemented as a decorator, which is handy and readable in code.

So far I have not found any problem with FreezeGun.

qgrid is an interactive grid for Jupyter notebook

Published on: 01.10.2019

One useful tool for doing exploratory data analysis in Jupyter notebook is qgrid.

Qgrid is an interactive grid for sorting, filtering, and editing Pandas DataFrames in Jupyter notebooks.

When I first saw qgrid, I did not understand the use of it.
Everything that qgrid can do I can do in regular Pandas code.

When I started doing exploratory data analysis, after some time I noticed that I spent most of my time sorting and filtering data to understand/find the connections in it: changing code, pressing SHIFT+ENTER and waiting for the result.

With qgrid, the steps of changing code and pressing SHIFT+ENTER are removed. Changing filters/sorting in qgrid requires less mental energy than writing code, and it is less error-prone.

Qgrid makes the whole process less time- and energy-consuming. Basically, I am more productive with qgrid.

If you do any exploratory data analysis, my suggestion is to try it.

Qgrid offers much more than just interactive sorting/filtering for Pandas DataFrames, for example events. For more information, there is a nice 1h video and the documentation.

HTTP ping

Published on: 01.08.2019

ping is a great tool to check if some computer/server/router/camera (any IP addressable device) is online.

For me, it is useful when restarting a virtual machine: I use ping with the -t flag to keep checking until the machine is back online, so that I can then check if everything is working as it should.

Just to give some context, at that time I was responsible for 120 virtual machines in one organization.

Some of the virtual machines were web servers. Although ping was useful for knowing when a machine is back online, it would be more useful to know when the web page is online.

That got me thinking that it would be great if I had some kind of ping for HTTP.

Luckily it exists and it is called httping.

If you use Chocolatey, installation is easy with choco install httping.

From my experience, httping works fine, as expected.

The only issue that I have found is that the web server is often online a few minutes before the web page that I need is online.

This is not a problem with httping. Normally the web server is online before the web page, especially if the web page is some complex web application.

Next step would be to find some httping tool with grep capability.

I could probably build that tool myself in a few hours (with Python), but currently the benefit does not justify the cost for my use case.
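To give an idea of what such a tool could look like, here is a rough sketch using only the Python standard library. It is not an existing tool. The function names, the default interval and the example URL in the note below are all hypothetical:

```python
import time
import urllib.error
import urllib.request


def page_contains(url: str, needle: str, timeout: float = 5.0) -> bool:
    """Return True if the page at url responds and its body contains needle."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        # Server not reachable yet, or it answered with an HTTP error.
        return False
    return needle in body


def wait_until_online(check, interval: float = 1.0, max_tries: int = 60) -> int:
    """Call check() repeatedly; return the attempt number that succeeded, or -1."""
    for attempt in range(1, max_tries + 1):
        if check():
            return attempt
        time.sleep(interval)
    return -1
```

It could be used like wait_until_online(lambda: page_contains("http://my-server.local/login", "Sign in")), where the text to look for is something that only appears once the web application itself is really up.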

If you know some tool with those features, feel free to leave a comment.

Windows package manager

Published on: 01.06.2019

Chocolatey is a decent package manager for Windows. It is usable and has an up-to-date list of packages.

A package manager provides a way to install/uninstall software from the CLI.

If you are not a CLI user, the Chocolatey package manager is a good way to start using the CLI.

What is the benefit of a CLI package manager?

For me personally, the benefit is in spending less time installing, updating and uninstalling software.

The traditional workflow of installing software is usually the following:

  1. open a web browser
  2. search for the webpage from which the software can be downloaded (just this can take a long time)
  3. open the desired webpage and find where to download the software
  4. then initiate the download
  5. after the download is completed, start the installation and do “Next-Next-Finish”
  6. delete the original installation file

As you can see, there are at least 6 steps in the workflow.

With a CLI package manager, all I need to do is type choco install SOFTWARE in the CLI and everything else is done automatically.

This is more productive.

If you want a GUI, there is also a package for that.

The first thing to do before you start to write code

Published on: 01.05.2019

Let me start with a personal true story.

In 2014 I was working on a personal project, an iOS card game called Tablic. I spent more than 200 work hours on it, did 80% of the original plan and never finished it.

Do you know why?

Because of software scope creep.

Originally I had the idea to make a game with one player vs. computer.

As I was nearing that goal, I started adding features: I could add 4 players, a user could select the AI difficulty.

Each additional feature meant more development time: new code and changes to the existing code.

And after all this, I decided to add multiplayer support over local WiFi and over the internet.

This last decision truly killed the project.

In order to implement multiplayer, I had to write a lot of additional code and change the existing architecture. It was at least an additional 300 hours, and after more than 200 hours already spent, I decided to take a small break (like a week) but never continued.

The reason why I did not want to release it without multiplayer was that I thought it was not good enough: other games had multiplayer, so how could I make one without it?

I always thought that I would continue and finish it one day, but that day never came. Today I think that it is probably better to rewrite it in Unity than to continue in Objective-C (but that is a discussion for another time).

Years later, when I was analyzing why I never finished that iOS game, I came to the conclusion that the original problem was that I did not have a specification for the first version.

By specification, I just mean a list of features with the dependencies between them, a basic UI sketch and time estimates.

As I was completing some features, I kept adding new ones indefinitely.

I am pretty sure that if I had implemented multiplayer, I would have added some other features as well.

Today I am wiser or I just think so.

Now I have a process for writing software.

Before I write code, I decide what MVP I will make, without even thinking about additional features.

The reason why I do not even want to write additional features is that I have learned that even when I make software for myself, the software that I make is not the software that I need.

Define the minimal set of features you need to have in your software, the dependencies between them and time estimates for how long it will take to make them.

I do time estimates in pomodoros (25-minute increments), but other time units can be used.
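Such a specification does not need any special tool. As a small sketch, here is how a feature list with dependencies and pomodoro estimates could be written down in Python. The feature names and numbers are invented for illustration and are not the real plan for Tablic:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

POMODORO_MINUTES = 25

# Hypothetical MVP feature list: name -> (estimate in pomodoros, dependencies)
features = {
    "card deck model": (4, []),
    "game rules": (10, ["card deck model"]),
    "computer opponent": (12, ["game rules"]),
    "basic UI": (16, ["game rules"]),
}

# Order features so that every dependency comes before the feature that needs it.
graph = {name: deps for name, (_, deps) in features.items()}
build_order = list(TopologicalSorter(graph).static_order())

total_pomodoros = sum(estimate for estimate, _ in features.values())
total_hours = total_pomodoros * POMODORO_MINUTES / 60

print(build_order)
print(f"{total_pomodoros} pomodoros = {total_hours:.1f} hours")
```

Anything that is not on this list is, by definition, not part of the first version.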

You cannot understand the solution until you have had the problem

Published on: 01.04.2019

I understand RESTful web services, or at least I think I do.

I agree that when you have huge teams and a huge code base, it makes sense to cut the code base into small independent pieces and connect them via queues and HTTP.

Collaboration on large software projects is hard, and problems increase exponentially with the number of people added to the project.

The tradeoff is that the overall speed of your software will decrease (because of HTTP network calls), but you will get a software system that can be maintained and extended with new features without the need to understand/change/impact the whole system.

But I never found a use case for myself as a one-man team working on his own projects.

Until one morning.

Architecture

I have a lot of (around 10) independent programs that run daily (some even hourly).

Most of them do some variation of web scraping, storing, analysis and reporting of results via email.

This was all fine until one morning I woke up and saw that there were no emails from my software.

I knew that something was not right.

They all use yagmail for sending email, so I thought that there was some problem with it, because it is a single point of failure.

After an investigation, I found out that the problem was with Gmail itself: it just stopped working. The next day it was fine, so they just had some issue that they needed one day to resolve (I am not talking about the Gmail web page, but about SMTP username/password authentication).

Why Gmail

Why do I use free Gmail for sending email and not some more reliable service like SendGrid or Amazon SES?

That is a nice lesson in technical debt. In essence, what was a good idea for the initial requirements is not such a good idea anymore as time progresses and requirements or circumstances change.

When I started my first project as a proof of concept, Gmail was an excellent choice: easy to start with and working fine.

As the project moved to deployment and additional projects were made, it was easier to copy/paste the existing code than to refactor/redesign/rearchitect the existing working solution.

REST solution

Emails did not work for me for one day, and a day later everything was back to normal.

I started to think about what I could do to avoid this problem in the future.

One solution would be to change from Gmail to something else, but there are a few issues with that that I do not like.

First issue

What if the other solution (email provider) stops working in the future? I would again need to write new code for a third solution.

To fix this problem, my idea is to use Gmail as the primary provider for sending email, and if sending fails, to just use a secondary email provider.

With this logic, I can add a third one as well and so on, but I think that two are enough for the first version.
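The fallback logic itself is small. Here is a minimal sketch of the idea, not my actual code. It assumes that each provider is wrapped in a plain function that raises an exception when sending fails, and the name send_with_fallback is made up:

```python
from typing import Callable, Sequence

# A sender takes (to, subject, body) and raises an exception on failure.
Sender = Callable[[str, str, str], None]


def send_with_fallback(senders: Sequence[Sender], to: str, subject: str, body: str) -> int:
    """Try each sender in order; return the index of the one that succeeded."""
    errors = []
    for index, send in enumerate(senders):
        try:
            send(to, subject, body)
            return index
        except Exception as error:  # any provider failure triggers the fallback
            errors.append(error)
    raise RuntimeError(f"all {len(senders)} email providers failed: {errors}")
```

The first sender in the list would be a wrapper around Gmail and the second one a wrapper around the secondary provider. Adding a third provider is then just one more item in the list.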

Second issue

Currently, I have around 10 apps (and this number will increase with new apps that I plan to build in the future) that need to send email, and each has a separate code repository.

If I want to change something in the email logic, even something as simple as the username/password, I need to make the same change in 10 different code bases.

One solution is to make one code base just for sending emails. This would solve the problem of making the same changes in multiple apps.

But for this to work, I need to change the folder structure of all my apps and update paths in the code bases, and I can use it only if all apps are hosted on the same machine.

If they are on separate machines, it will not work.

REST to the rescue

After understanding all the difficulties, making a RESTful web service just for sending email made total sense to me.

The only reason why it made sense to me is that I have a use case where REST is useful and looks like the only solution.

The first version will just be an adapter/facade around yagmail with a REST API, but that is a story for another time.

Verification vs Validation in practice

Published on: 01.03.2019

Verification is the process of checking that the software meets the specification.

It is doing what you wanted it to do.

An example could be a function that needs to add two numbers: you then verify (for example by writing a unit test) that it does that correctly.

Validation is the process of checking whether the specification captures the customer’s needs.

Using the same example of a function that adds two numbers, validation needs to confirm that this function is really what the user needs, e.g. maybe the user actually needs to multiply two numbers.
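The difference can be sketched in a few lines of Python. The function add and the test class are illustrative names only:

```python
import unittest


def add(a, b):
    """The specification: return the sum of two numbers."""
    return a + b


class AddVerification(unittest.TestCase):
    """Verification: check that add() does what the specification says."""

    def test_adds_two_numbers(self):
        self.assertEqual(add(2, 3), 5)

    def test_adds_negative_numbers(self):
        self.assertEqual(add(-2, -3), -5)
```

These tests can be run with python -m unittest. They all pass even if the user actually needed multiplication, and no unit test will catch that. Catching it is the job of validation.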

Practice vs theory

When I first heard about validation, I was able to understand it in theory, but in practice I thought: it is easy to know what you want, so why do you need to validate it?

Then I had personal experience of why and how validation is hard.

I built the wrong software for myself and I had no one else to blame.

How I built the wrong software for myself

My idea was to make software that runs at 1 AM every day, takes all real-estate ads from https://www.njuskalo.hr/ for my town listed on the previous day, sorts them by price per square meter and sends them to my email.

Basically, I wanted all new ads per day in my email, sorted by price per square meter (a one-day delay was fine for me).
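The sorting step is the simple part. As a sketch, with made-up ads and hypothetical field names (the real scraper produces its own data):

```python
# Made-up example ads; the field names are assumptions for this sketch.
ads = [
    {"title": "Flat A", "price_eur": 120000, "area_m2": 60},
    {"title": "Flat B", "price_eur": 90000, "area_m2": 50},
    {"title": "Flat C", "price_eur": 150000, "area_m2": 60},
]


def price_per_m2(ad: dict) -> float:
    """Price per square meter, the value the daily report is sorted by."""
    return ad["price_eur"] / ad["area_m2"]


# Cheapest price per square meter first.
cheapest_first = sorted(ads, key=price_per_m2)

for ad in cheapest_first:
    print(f"{ad['title']}: {price_per_m2(ad):.0f} EUR/m2")
```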

Looks simple enough, what could go wrong?

After a few days I had it running in production and it was working. Verification was successful: every day I got all ads from the previous day.

Why validation was wrong

After a week I found out that my software was useless.

What was the problem?

Remember that I wanted all new ads per day in my email. I wanted all “new ads per day”, but what I got was all updated and new ads per day.

Let me explain.

Every day I was getting around 200 ads, and I noticed that a lot of them were the same ads, day after day.

What was happening is that a lot of people were just updating the same ad every day.

They do this so that their ad is always on the first page, and sometimes they even do it a few times per day (later I found out that a friend of a friend was contracted by a local real-estate agency to make software that automatically updates ads for them).

Although my software was working correctly, only after I had made it did I find out that it was useless because of wrong assumptions.

My assumption was that every ad would be added only once, not that 60% of ads would be updated every week.

I solved this problem by making version two, which could tell whether an ad is new or updated and, if updated, what was updated.
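The core idea of version two can be sketched like this. It is a simplified illustration, not the real code. It assumes that every ad has a stable id, and it keeps the previously seen ads in a plain dict instead of a database:

```python
def classify_ads(seen: dict, todays_ads: list) -> dict:
    """Split today's ads into new, updated and unchanged, and remember them.

    seen maps ad id -> the ad as it was last stored (hypothetical structure).
    """
    result = {"new": [], "updated": [], "unchanged": []}
    for ad in todays_ads:
        previous = seen.get(ad["id"])
        if previous is None:
            result["new"].append(ad)
        else:
            # Collect every field whose value differs: field -> (old, new).
            changed = {key: (previous.get(key), value)
                       for key, value in ad.items() if previous.get(key) != value}
            if changed:
                result["updated"].append({"id": ad["id"], "changes": changed})
            else:
                result["unchanged"].append(ad)
        seen[ad["id"]] = ad
    return result
```

An ad that was only bumped to the first page, without any real change, ends up in the unchanged group and can be left out of the daily email.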

Am I stupid?

This experience was fascinating to me.

On this project, I was everything: user, project manager, architect, coder, quality assurance, investor. Every hat was on my head and I still managed to build the wrong thing.

It gave me a practical understanding of why it is common that the end user is not happy with the final product.

Even if everything is done correctly, it is possible that the final product does not solve the user’s original problem due to wrong initial assumptions.

How to improve validation

One approach is to make an MVP. This way you will spend fewer resources on version one.

If validation of the MVP is successful, then add additional features; if not, cancel it.

Another approach is to get some domain knowledge, either internal or external.

I have built a few web scrapers in the last few years and now know a few tricks about that domain, but I learned each one the hard way.

I also understood why some companies hire domain expert consultants (just be sure to get a good one).

Technology

For those interested in what tools I used to build my software, here is the list: Scrapy, dataset, yagmail.

The most important lesson for new programmers

Published on: 15.01.2019

On my “Sending email from Python” blog post, which was cross-published on Medium, I got a comment asking how to send an email via Outlook programmatically.

My first reaction was that this was some troll or bot.

So I gave a “Let me google that for you” answer and later got a “Thank you” response.

That got me thinking: maybe he was not an internet troll, maybe he just did not know how to google.

It never crossed his mind that he could ask Google for the answer.

Why I was thinking that somebody was trolling me

I am an experienced (15+ years) software developer. I am experienced because I know that when I do not know something, first I google it, then I search on YouTube, and the last resort is to ask on StackOverflow.

This is what professionals do. They do not ask questions on random blogs in the hope that somebody will respond.

Learn how to google

For beginners learning to code, the best thing you can do for yourself (and others) is to learn to google what you do not know.

Today it is easier to learn coding than 20 years ago when I was starting.

In my time, the only thing that you had was a book (if you were lucky).

Today there are many more opportunities to learn:

  • YouTube, which is the largest free video learning tool
  • Google for asking
  • and StackOverflow communities where you can ask questions

Be aware that you should not ask a question specific to your particular coding problem; bring it to a more abstract level.

Tips on googling

From my experience, it is important to know which keywords to google.

But if you do not know the keywords, you can always start with “how to …”.

Any action is better than no action.

Most programmers are financial morons

Published on: 01.01.2019

Let me start with one true story from the year 2011.

At that time I was working as a software programmer (90% C++) in a team of 5 people.

One morning, a friend from the team started showing off a cool new source code editor called Sublime Text.

He was very happy with it. He had been using it on the job, for his own pet projects and for his freelancing side jobs for a few months.

But for him Sublime Text had one drawback: he had to pay 100$ for it (at the time of this writing a Sublime Text license is 80$; I think that at that time it was 100$, but I could be wrong).

At that moment I knew that my friend was a financial moron.

I tried to explain it to him, using the same logic as in this blog post, but he just could not get it. He only understood that he had to spend money.

Why most programmers are financial morons

Let us say that he was using Sublime Text only every second day (although, knowing him, it was probably every day).

With the every-second-day assumption, that is 182 days per year.

He was happy with the new tool and it was better for him, so let us say that he got 10 extra minutes of work done on each of those days.

10 minutes times 182 days is about 30 more hours of work per year.

To break even, he would need to make about 3.33$ per hour of work.

Even at that time, his freelance rate was 20$ per hour and he had around 5 billable work hours per week.
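The same calculation written out as a small Python script, using only the numbers from this post (the variable names are mine). Without rounding the saved time down to 30 hours, the break-even rate comes out slightly lower, at about 3.30$:

```python
license_price = 100          # dollars, the price I remember from that time
days_used_per_year = 182     # using the editor every second day
minutes_saved_per_day = 10   # assumed productivity gain per day of use

hours_saved_per_year = days_used_per_year * minutes_saved_per_day / 60
break_even_rate = license_price / hours_saved_per_year

freelance_rate = 20          # dollars per hour
hours_to_pay_off = license_price / freelance_rate

print(f"hours saved per year: {hours_saved_per_year:.1f}")
print(f"break-even hourly rate: {break_even_rate:.2f}")
print(f"billable hours needed to pay for the license: {hours_to_pay_off:.0f}")
```

At his freelance rate, the license is paid for after 5 billable hours, which is about one week of his side work.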

He is a smart guy, but he thought that the tool was expensive.

Economically speaking, he did not know how to do a cost-benefit analysis.

It is strange how logically intelligent programmers (believe me, you do have to be logically intelligent to write computer programs) never invest in tools that basically pay for themselves in days.

Conclusion

Do a cost-benefit analysis before saying that something is expensive or cheap.

Disclaimer:
I have no interest in whether you use or buy Sublime Text or not, I am just using it as an example.