All-time most catastrophic career bloopers
I work at a software company which provides software solutions to financial institutions, with Nigerian commercial banks as its most relevant customer base. I work there in application support. It's a pretty interesting role, in that it allows me to exercise my people-facing (customer-facing) skills, while not totally pulling me out of doing techie stuff.
In the 1+ year I've been there, I've been in charge of an enterprise application which is in use at about 10 commercial banks. The application allows the banks to offer online payment collection services to website owners, a.k.a. merchants. The role was turbulent for the first few months, but as they say, adversity breeds strength. So, right now, I'm the one to talk to when:
- The application crashes and is either partially or fully defunct;
- A new merchant is having technical issues setting up their website to accept payments through the application;
- The bank needs an additional functionality, or some positive modification of an existing feature;
- There is a catastrophic event, due to an unprecedented malfunctioning of the application.
I really love what I do at my current job. I love how there is always a new challenge to figure out and solve every day. Opening emails addressed to me gives me a semi-orgasm. Fixing things has always been a favorite pastime of mine, so this job is more or less a perfect fit in that regard.
Above all, I love how it has moulded me into a more compassionate and self-confident individual. When I meet with clients, which is often when they have problems with the application, I have to fully empathize with them, as well as exhibit confidence that all will be well in the best possible time. This has afforded me the chance to exercise my "thinking-on-my-feet" muscles to a point where I'm pretty good at it.
So much for background, let's talk about the goddamned bloopers already.
Blooper 1
I had recently deployed an update to the application. The update was aimed at reducing the number of uncompleted transactions on the platform. I was proud of the work, and I felt truly fulfilled after the implementation. Unfortunately, yours truly overlooked a particular corner case that turned out to be the source of a cardholder and merchant nightmare.
The problem was this:
The system keeps records of transactions as soon as the merchant sends the customer over for payment. One of its core functionalities is to keep track of these transactions from initiation to completion. This means the system must be able to provide, on demand and at any point in time, the accurate status of the transaction (successful, failed or not completed).
Now, after the update, while the system kept accurate track of transactions from merchants who bore the bank fees themselves, it failed to do so for transactions from merchants who passed those fees on to customers. I guess you can call it a punishment for greed.
Unfortunately, about three weeks passed before we noticed the error, and by then it had prevented accurate record-keeping for quite a volume of transactions. The major consequence was that when customers' payments were successful, the transactions appeared in the system as though they had not paid. This meant affected merchants had no way of knowing whether a payment was successful, and as such, were not able to deliver the corresponding value to customers.
The real-world impact of my carelessness was as follows: a customer pays (their account gets debited), then returns to the merchant's site expecting to see a message that their purchase was successful. Shockingly, this is not the case, as my system holds an inaccurate record which does not reflect the actual payment status. The customer panics and goes to their own bank to report fraud, since as far as they're concerned, this is an attempt by the merchant to rip them off of their hard-earned cash. All because of my one mistake, customers suffered heartbreak and merchants suffered a massive loss of reputation with their customers. This affair went on for three good weeks. Millions of Naira, locked down, all because of me.
Action Plan
Once the problem was discovered, I devoted all of my mental energy to cracking down on the issue. The steps I took were as follows:
- Trace down the bug which led to the error and fix it immediately;
- Mitigate damage done by attempting to get the accurate status of all affected transactions. I sincerely do not know how effective this was though. I strongly suspect not so much.
Lesson learned
After any major change to the application, do test transactions with both categories of merchants: the greedy and the non-greedy. Be sure the accurate status is being reflected at each stage of the transaction, from initiation to completion.
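A check like that can be automated. What follows is only a minimal sketch: the real application's code and the actual cause of the bug aren't shown in this post, so every name here (FeeBearer, Transaction, record_result and so on) is hypothetical, and the "buggy" recorder is an invented stand-in used purely to show the check catching a fee-related slip.

```python
from dataclasses import dataclass
from enum import Enum


class FeeBearer(Enum):
    MERCHANT = "merchant"  # merchant absorbs the bank fee
    CUSTOMER = "customer"  # fee is passed on to the customer


class Status(Enum):
    NOT_COMPLETED = "not completed"
    SUCCESSFUL = "successful"
    FAILED = "failed"


@dataclass
class Transaction:
    reference: str
    amount: int  # price of the goods, in kobo
    fee: int     # bank fee, in kobo
    fee_bearer: FeeBearer
    status: Status = Status.NOT_COMPLETED


def expected_debit(txn: Transaction) -> int:
    """Amount the customer should be debited under this fee arrangement."""
    if txn.fee_bearer is FeeBearer.CUSTOMER:
        return txn.amount + txn.fee
    return txn.amount


def record_result(txn: Transaction, debited: int) -> Status:
    """Hypothetical recorder: mark success only when the debit matches."""
    if debited == expected_debit(txn):
        txn.status = Status.SUCCESSFUL
    return txn.status


def record_result_buggy(txn: Transaction, debited: int) -> Status:
    """Invented stand-in for a regression: forgets customer-borne fees."""
    if debited == txn.amount:
        txn.status = Status.SUCCESSFUL
    return txn.status


def smoke_test(record) -> list:
    """Run one test payment per fee arrangement; return the ones that fail."""
    failures = []
    for bearer in FeeBearer:
        txn = Transaction(f"TEST-{bearer.value}", 100_000, 1_500, bearer)
        # What the customer's bank would actually debit for this payment.
        debited = txn.amount + (txn.fee if bearer is FeeBearer.CUSTOMER else 0)
        if record(txn, debited) is not Status.SUCCESSFUL:
            failures.append(bearer)
    return failures
```

Running smoke_test(record_result) returns an empty list, while smoke_test(record_result_buggy) flags only the customer-pays arrangement, which is exactly the category that slipped through in this blooper.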
Blooper 2
A poorly thought-out change I recently made to the system led to another set of relatively catastrophic consequences at a client bank. Now, this change was not even really called for. When making the change, my intention was simply to correct a typographical error in a particular merchant interface. Apparently, I forgot the programming principle that says one can change implementations, but not interfaces.
Unfortunately for me, a high-frequency web merchant had their website tightly integrated with the old interface, typo included. So, when we corrected the typo, the integration just quietly broke. The effect? The merchant was unable to interpret the response from the modified interface, which unfortunately happened to be the interface through which the merchant checks my system to see whether or not a customer's payment was successful. The effect of this on the merchant-customer relationship was identical to that of blooper 1.
Customers were paying in hordes by the minute. Yet, when it came time for the merchant to deliver the corresponding value to the customer, an error occurred due to my blooper and caused this last leg of the process to fail. Come Monday (the bug was introduced towards the weekend), customers flocked in their hundreds to their own banks to demand refunds (lay chargeback claims). And fire came raining upon me as well.
Thankfully, this error only lasted about four days in the system, thanks to the soundness of the affected merchant's technical team. As a matter of fact, the merchant turned off the gateway after about two days of observing the anomaly. It's not usual, when things like this happen, for the merchant to be able to provide a technical perspective on the source of the problem. Usually, all they can explain is the consequence. However, this particular merchant was able to show us how the correction of that typo was the root of the problem.
Action Plan
Correcting the consequences of this blooper was far simpler than for the first. The status of the transactions on my system was accurate. So all that we needed to do was get the merchant's website updated with the correct transaction statuses. Ideally, the responsibility for doing this lies with the merchant. However, since it became clear that the problem was wholly caused by the bank's application, the merchant insisted on having the problem wholly resolved without its own active involvement.
In response, I created a small Windows application to simulate the final leg of each of the affected successful transactions, so the merchant could deliver value, albeit belatedly.
I hear there were about 300 chargeback claims from affected customers in one day. You know how bad this was when you consider that the average number of daily claims is 15. Many heartbroken customers, and a merchant with a dented reputation and customers lost forever.
Thankfully, that episode was over as quickly as it began, and it's amazing that despite how severe it was, no other person at my company was aware of it. Typically, this kind of issue would have flown up to the bigwigs at my company, due to the severity. I feel like with my conduct over time, I've been able to inspire high confidence in my powers in the minds of usually fretful bank staff. They believe me when I tell them we're going to do something. I work hard to retain this image by striving to keep to proposed deadlines. It's hard, I often fail at it, but somehow, they still trust me. Good thing for all of us.
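The idea behind that tool, roughly, looks like the sketch below. It is written in Python purely for illustration and is not the actual Windows application; TxnRecord, replay_final_leg and the notify_merchant callback are hypothetical names, and how the merchant's site is really notified is assumed rather than taken from the real integration.

```python
from typing import Callable, Iterable, NamedTuple


class TxnRecord(NamedTuple):
    reference: str
    status: str  # "successful", "failed" or "not completed"


def replay_final_leg(
    records: Iterable[TxnRecord],
    notify_merchant: Callable[[str, str], bool],
) -> dict:
    """Re-send the final status of each successful transaction to the merchant.

    notify_merchant(reference, status) is an assumed callback that returns
    True when the merchant's site accepts the update.
    """
    delivered, retry = [], []
    for rec in records:
        if rec.status != "successful":
            continue  # only successful payments need value delivered
        try:
            ok = notify_merchant(rec.reference, rec.status)
        except Exception:
            ok = False  # treat any error as "try again later"
        (delivered if ok else retry).append(rec.reference)
    return {"delivered": delivered, "retry": retry}
```

Keeping a separate "retry" list means a second pass can target only the transactions the merchant's site did not accept the first time.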
Lesson learned
Never change anything in an application interface without duly notifying all current subscribers.
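Alongside notifying subscribers, one common way to correct a misspelled field without breaking anyone is to keep the old name working next to the new one for a while. The post doesn't say what the typo was or what the response looked like, so the key names and the dictionary format below are invented for illustration only.

```python
# Hypothetical field names: the real typo is not given in this post.
LEGACY_KEY = "ResponseDescripton"      # misspelled name merchants already parse
CORRECTED_KEY = "ResponseDescription"  # corrected name for new integrations


def build_status_response(reference: str, status: str) -> dict:
    """Return the status under both names so old integrations keep working."""
    return {
        "Reference": reference,
        CORRECTED_KEY: status,
        LEGACY_KEY: status,  # kept until every merchant has migrated
    }


def legacy_merchant_parse(response: dict) -> str:
    """Stand-in for a merchant site still reading the misspelled field."""
    return response[LEGACY_KEY]


def updated_merchant_parse(response: dict) -> str:
    """Stand-in for a merchant site that has moved to the corrected field."""
    return response[CORRECTED_KEY]
```

With this arrangement, both the old and the updated merchant code read the same status from the same response, and the misspelled field can be retired only after every subscriber confirms they have switched.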
Conclusion
I wield superpowers, with which I may sabotage a number of booming Nigerian web businesses far and wide. This makes me feel a deep sense of responsibility, and I'm usually neck deep in cleaning up my shit whenever bad things happen due to my not being careful enough at work. I now fully grasp the fact that a lot of businesses' reputations and livelihoods depend directly on the quality of my work. This drives all my activity at work.