On the idea of "Just Restart the Application"

Updated on December 10, 2019 at the 22nd hour

DISCLAIMER: Expressed views on this blog are my own.

I can say I don't miss the Microsoft Windows days at all: programs would leak memory and perform illegal operations, your OS would freak out and slow down, and at some point you'd give up and restart the computer so that everything was back to normal. Thank god for better tools and development practices.

Now, when this practice is used for networked applications in a datacenter where someone is on call for that application, well, I can say that if you worked for me you would be teetering on the line that separates a low performer from someone who does just enough. Yes, I vehemently detest the idea of "just restart the application" with no solution in sight. That anyone would allow an application to keep OOM-ing says a lot about the culture.

Any software engineer managing the application must get down to business with heap analysis tools to figure out what is being allocated, and what is allocating the memory and then not freeing it. Leaving the application alone for weeks on end only to restart it, or scheduling restarts of whole clusters of applications because nobody can find the source of the memory leak, is more than enough cause to be classified as a low performer. It is not a good experience for anyone involved. It is effectively chaos testing in the wild. Just no. Put whatever you are doing on hold, get out your detective hat, bust out the best-in-class memory analysis tools and stop accumulating power bills.

If it is late at night and you need to sleep, roll back whatever changes went out, sleep, and come back to the problem later. Sometimes a rollback isn't as simple as pushing a button to go back to a previous version, and that is unfortunate.

It is understandable that someone will have to get paged for a while as the problem is actively being looked into. Anything more than that is a sign of amateurism, lack of ownership and generally bad management. OOM is not a good sight.
