jay's old blog

this blog will be deleted soon - please visit my new blog - https://thesanguinetechtrainer.com

Error and Logging

[Ongoing series of blog posts to inform potential developers, users and (hopefully) investors about this new app ecosystem I am architecting, designing, developing and deploying. More details at this page]

A system, no matter how good it is, should be capable of communicating well. This applies to people too. Suppose there are two individuals: one is an excellent worker with poor communication skills, and the other has average work skills but excellent communication. On my best day or my worst day, I will always go with the person who has excellent communication skills (and average work skills). That’s because a person with good communication is the one who will keep me informed about the good and the bad stuff, no matter how comfortable or uncomfortable the truth is.

At the outset, this all seems like common sense. However, in all the years I have been working, common sense has been perhaps the rarest of commodities. People do stupid things, then hide behind a veil of ego and false claims, and lose touch with reality. That is why, as I continue to work on my app ecosystem, I have realized that it is essential for it to have a proper error and logging system.

From a strictly programming perspective, errors and logs both do the exact same thing: they record what happened. When something happens as expected, it is ‘logged’. When something unexpected happens, it is also logged, but shows up as an ‘error’. For instance, when a user signs in successfully, that is a ‘log’. When a user tries to sign in but it does not happen for some reason, that is an ‘error’.
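To make the distinction concrete, here is a minimal sketch using Python's standard `logging` module. The logger name `signin_demo`, the `ListHandler` class and the `record_sign_in` function are made up for this illustration and are not part of the app ecosystem itself. Both outcomes go through the same logger; only the level differs.

```python
import logging


class ListHandler(logging.Handler):
    """Keeps records in memory so the example is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("signin_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)


def record_sign_in(user_id: str, succeeded: bool) -> None:
    """Write one event: a plain log on success, an error otherwise."""
    if succeeded:
        logger.info("sign-in ok for %s", user_id)
    else:
        logger.error("sign-in failed for %s", user_id)


record_sign_in("alice", True)   # expected outcome, recorded as a log
record_sign_in("bob", False)    # unexpected outcome, recorded as an error

levels = [record.levelname for record in handler.records]
```

Here `levels` holds one `INFO` entry and one `ERROR` entry, which is the whole difference between a ‘log’ and an ‘error’ in this sketch.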

A collection of logs is useful for making decisions about performance optimization. For instance, if every user in the system takes 10 seconds to sign in, and I discover a new library that allows sign-ins to happen in 5 seconds, I know I need to adopt it, thereby saving 5 seconds for every user on the system. Logs are more of a proactive measure for making a system better. We use the collection of collections of logs (the repetition is intentional, not a typo) to find out what about the system can be improved.
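The sign-in comparison can be sketched in a few lines of Python. The log entries, the `average_duration` helper and the 5-second figure for the candidate library are all assumptions taken from the example above, not measurements from a real system.

```python
from statistics import mean

# Hypothetical log entries: (event name, duration in seconds).
sign_in_logs = [
    ("sign_in", 10.2),
    ("sign_in", 9.8),
    ("sign_in", 10.0),
    ("page_load", 1.5),
]


def average_duration(logs, event):
    """Average the durations of every log entry for one kind of event."""
    durations = [seconds for name, seconds in logs if name == event]
    return mean(durations)


current = average_duration(sign_in_logs, "sign_in")
candidate = 5.0  # assumed sign-in time with the new library
saving_per_user = current - candidate
```

With these made-up numbers the average sign-in takes about 10 seconds, so the switch would save roughly 5 seconds per user. The point is that the decision comes out of the collected logs rather than a guess.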

A collection of errors is useful for finding out what is going wrong. Errors, unlike logs, are about reacting to something that is going wrong. Let’s use the sign-in example. If 10% of users are having difficulty signing in, then that is a problem. It is a problem that needs to be fixed as soon as possible, and the system should automatically raise an alert when the error percentage rises. It should inform the operations folks and automatically pull up the established documentation, so that the emergency response can start working.
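A rough sketch of such a trigger is below. The `ErrorRateMonitor` class, the 10% threshold and the runbook name `signin-runbook` are assumptions for illustration; a real system would send the alert to the operations team instead of just returning a string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ErrorRateMonitor:
    threshold: float = 0.10          # assumed: alert once 10% of sign-ins fail
    runbook: str = "signin-runbook"  # hypothetical documentation name
    attempts: int = 0
    failures: int = 0

    def record(self, succeeded: bool) -> None:
        """Count one sign-in attempt and whether it failed."""
        self.attempts += 1
        if not succeeded:
            self.failures += 1

    def error_rate(self) -> float:
        return self.failures / self.attempts if self.attempts else 0.0

    def check(self) -> Optional[str]:
        """Return an alert for the operations folks, or None if all is well."""
        rate = self.error_rate()
        if rate >= self.threshold:
            return f"ALERT: {rate:.0%} of sign-ins failing, see {self.runbook}"
        return None


monitor = ErrorRateMonitor()
for attempt in range(100):
    monitor.record(succeeded=attempt >= 12)  # first 12 of 100 attempts fail

alert = monitor.check()
```

In this run 12 of 100 sign-ins fail, which is above the assumed threshold, so `alert` carries both the failure percentage and a pointer to the documentation.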

In fact, if possible, the system should not only diagnose the issue on its own but also fix it on its own. I could be throwing darts in the dark here, but I think this is what machine learning is all about: the machine learns from its own mistakes and then starts acting on them, like fixing problems with no intervention from anybody. I think this is practical and even possible. In my own experience, mistakes are like wheels. Some wheels need to be invented, while others have already been invented and hence need not be reinvented.

Let’s say the system logs an error for the first time. Since this is the first time, a human is involved in fixing it. Once the issue has been fixed, programming is done so that the next time the same error is triggered, the system will attempt this fix. That means no human needs to be involved. The human can now focus on fixing new issues, instead of reinventing a wheel that was already wheeled out last time. Machines are good at repetitive tasks, so if something has been repeated, I don’t quite understand why humans should be involved. The time spent by the human re-fixing what has already been fixed could be used for something else. That time could be spent fixing new issues. That time could be spent improving the system, like making it faster or better. For all I know, the human can use the same time to watch a movie or fall asleep on the couch or go for a walk. The idea is to avoid reinventing the wheel: let the machine do what has already been done, and let the human do something new and exciting.
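The workflow described above can be sketched as a registry of known fixes. Note that this is a plain lookup table, not machine learning, and every name in it (`known_fixes`, `register_fix`, `handle_error`, the error code `SESSION_CACHE_FULL`) is invented for this example.

```python
from typing import Callable, Dict, List

known_fixes: Dict[str, Callable[[], None]] = {}  # error code -> recorded fix
needs_human: List[str] = []                      # errors seen for the first time
actions: List[str] = []                          # what the automated fixes did


def register_fix(error_code: str, fix: Callable[[], None]) -> None:
    """Called once a human has solved the problem and written down the fix."""
    known_fixes[error_code] = fix


def handle_error(error_code: str) -> str:
    """Apply a known fix if there is one, otherwise hand over to a human."""
    fix = known_fixes.get(error_code)
    if fix is None:
        needs_human.append(error_code)
        return "escalated"
    fix()
    return "auto-fixed"


# First occurrence: nobody has fixed this before, so a human gets involved.
first = handle_error("SESSION_CACHE_FULL")

# The human fixes it and records the fix for next time.
register_fix("SESSION_CACHE_FULL", lambda: actions.append("cleared session cache"))

# Second occurrence: the system applies the recorded fix on its own.
second = handle_error("SESSION_CACHE_FULL")
```

The first call ends up on the human's desk, while the second is handled by the recorded fix, which is the "don't reinvent the wheel" idea in its simplest form.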

So yes, my app ecosystem will have a machine-learning-enabled error and logging system.

Follow me on Twitter, Facebook and Instagram for more updates. Thanks!
