{"id":51,"date":"2011-10-20T00:26:54","date_gmt":"2011-10-20T05:26:54","guid":{"rendered":"https:\/\/www.falatic.com\/?p=51"},"modified":"2011-10-20T00:26:54","modified_gmt":"2011-10-20T05:26:54","slug":"thoughts-on-mobile-app-crash-detection-and-recovery","status":"publish","type":"post","link":"https:\/\/www.falatic.com\/index.php\/51\/thoughts-on-mobile-app-crash-detection-and-recovery","title":{"rendered":"Thoughts on mobile app crash detection and recovery"},"content":{"rendered":"<p style=\"text-align: left;\">As I work on updates to my app, I&#8217;m happy to see it&#8217;s working quite well. However, inevitably there will be some boundary case that causes an exception, and as I add features (and more importantly, persistent settings) an additional risk comes into play: failed restarts after a crash.<\/p>\n<p><!--more--><br \/>\nUnlike a desktop app, where one could add a &#8220;safe mode&#8221; icon for such instances, there are few options in the mobile space. These include:<\/p>\n<ul>\n<li>Letting the user delete the application data manually via the Settings menu<\/li>\n<li>Automatically resetting to defaults after a crash<\/li>\n<li>Attempting to automatically work around the broken area<\/li>\n<li>Asking the user on restart if they want to reset to defaults<\/li>\n<li>Falling back to the last known-good configuration<\/li>\n<\/ul>\n<p>It&#8217;s one thing to <em>be<\/em> a power user, but another to be <em>forced<\/em> to become one, so the first option isn&#8217;t really practical. The second option guarantees recovery but may degrade the user experience (especially if there are lots of settings involved)&#8230; it should be a last resort, and a user choice. The third option is nice, but can become quite complex: it works best if settings are checkpointed frequently, but such frequent saves to flash memory are not usually a great idea. 
In addition, the more complex the fail-safe, the more likely it is to trigger an exception itself.<\/p>\n<p>Asking the user first is an important part of any solution, but it requires architecting a fail-safe startup sequence (one that is preference invariant). One can then offer the choice of a &#8220;fresh start&#8221; or the option of using a backed-up configuration, ensuring that the user has at least a chance of restoring to a previously working state with their settings intact.<\/p>\n<p>A useful pattern for a single-instance app would be the following, using a persistent <em>IsRunning<\/em> flag:<\/p>\n<ul>\n<li>App launches<\/li>\n<li>Load settings<\/li>\n<li>Check if the <em>IsRunning<\/em> flag is set: if it is, then the last run exited abnormally (none of the normal exit points, which would&#8217;ve cleared the flag, were hit):<\/li>\n<ul>\n<li>Clear the <em>IsRunning<\/em> flag<\/li>\n<li>Offer to restore from backup \/ defaults or attempt to continue as-is<\/li>\n<\/ul>\n<li>Initialize states and display initial UI presentation<\/li>\n<li>Save the current settings with a backup name<\/li>\n<li>Set the <em>IsRunning<\/em> flag<\/li>\n<li>Save the settings normally<\/li>\n<li>On normal app exit \/ backgrounding, clear the <em>IsRunning<\/em> flag and save settings normally<\/li>\n<\/ul>\n<p>There are other things you can do, including instrumenting for crash data collection and presenting the user an option to forward that data (anonymized!) to you for analysis.<\/p>\n<p><strong>Software quality is as much about how well you avoid defects in the first place as it is about how gracefully you recover from them!<sup>\u2020<\/sup><\/strong> Mistakes happen, especially in apps with complex GUIs. Errors should be as rare as possible, but when errors occur it&#8217;s the unrecoverable ones that lead to lost users. 
Solid error handling and recovery go a long way towards satisfied users who stick around for the release that irons out the bug they encountered and worked around.<\/p>\n<p><strong><sup>\u2020<\/sup><\/strong><em>This is a critical difference between hardware and software quality, especially for smaller shops. If you ship a physically defective device, the typical recourse is a return for repair or replacement &#8211; a costly proposition in terms of logistics as well as reworked hardware \/ waste. The quality patterns for software and hardware have commonalities but are generally quite different in practice: in general, hardware is by its very nature <\/em>far<em> more costly and less forgiving of quality failures. This is why popular hardware quality initiatives tend to translate poorly to the software space.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>As I work on updates to my app, I&#8217;m happy to see it&#8217;s working quite well. 
However, inevitably there will be some boundary case that causes an exception, and as <a href=\"https:\/\/www.falatic.com\/index.php\/51\/thoughts-on-mobile-app-crash-detection-and-recovery\" class=\"more-link\">[&hellip;]<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"Layout":"","footnotes":"","_links_to":"","_links_to_target":""},"categories":[20],"tags":[116,82,84,83],"class_list":["entry","author-marty","has-more-link","post-51","post","type-post","status-publish","format-standard","category-android","tag-android","tag-crash","tag-quality","tag-recovery"],"_links":{"self":[{"href":"https:\/\/www.falatic.com\/index.php\/wp-json\/wp\/v2\/posts\/51","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.falatic.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.falatic.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.falatic.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.falatic.com\/index.php\/wp-json\/wp\/v2\/comments?post=51"}],"version-history":[{"count":0,"href":"https:\/\/www.falatic.com\/index.php\/wp-json\/wp\/v2\/posts\/51\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.falatic.com\/index.php\/wp-json\/wp\/v2\/media?parent=51"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.falatic.com\/index.php\/wp-json\/wp\/v2\/categories?post=51"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.falatic.com\/index.php\/wp-json\/wp\/v2\/tags?post=51"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}