{"id":1459,"date":"2022-07-06T09:52:00","date_gmt":"2022-07-06T09:52:00","guid":{"rendered":"https:\/\/nag.com\/?post_type=insights&#038;p=983"},"modified":"2023-07-05T10:18:22","modified_gmt":"2023-07-05T10:18:22","slug":"ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism","status":"publish","type":"insights","link":"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/","title":{"rendered":"AD Myths Debunked: Adjoint Automatic Differentiation (AAD) Will Destroy Parallelism"},"content":{"rendered":"<div class=\"container content-area-default \">\n    <div class=\"row justify-content--center\">\n        <div class=\"col-12 col-md-10 col-lg-8 col-xl-6\">\n            <div class=\"field field--name-field-paragraph-text field--type-text-long field--label-hidden field--item\">\n<p>This post is part of the AD Myths Debunked series.<\/p>\n<p>You\u2019ve decided you want your parallel library or application to benefit from the advantages of Adjoint Automatic Differentiation (AAD), but you are concerned that AAD may not be thread safe, may break vectorization, and may not play well with GPUs. How much of that is true? That is what we want to look at in today\u2019s post.<\/p>\n<p>AD is a semantic program transformation that adds functionality to your original implementation. The transformed program not only computes the function values, e.g., the price of a financial instrument, but also the sensitivities or derivatives thereof, e.g., the first-order Greeks. Using AAD for this has an impact on parallelizability; roughly speaking, that impact comes from the need to run the program backwards, from the outputs through every single statement to the inputs. This in turn means that parallel reads of non-thread-local data in the original program result in parallel writes to adjoint data in the reverse one. 
These parallel writes are a challenge because they potentially introduce race conditions. The straightforward way to resolve this, especially for parallel algorithms such as the Monte Carlo method, is to duplicate the shared data. Together with dco\/c++\u2019s multi-threading support, the parallelism can then be restored and carried over to the AAD computation. Let\u2019s look at specific aspects of adjoints and parallelism in a little more detail.<\/p>\n<ul>\n<li>Vectorization.<\/li>\n<\/ul>\n<p>With dco\/c++, the user employs a drop-in replacement for the arithmetic type used (e.g., replacing \u2018double\u2019 with \u2018dco::ga1s&lt;double&gt;::type\u2019). This type carries additional data, e.g., a tangent component for tangent\/forward mode AD, or a so-called \u2018tape index\u2019 for AAD. This naturally introduces a gap in the access pattern for the function values when operating on previously consecutive data, which hurts auto-vectorization. There are various solutions to this problem, though. dco\/c++ comes with a vector data type (dco::gv&lt;double, vector_size&gt;::type), which shifts the vectorization to a lower level. By doing so, the vectorization can be preserved and even inherited by the adjoint code. There are other approaches, such as using dco\/c++\u2019s specialized support for linear algebra libraries (e.g., BLAS or Eigen). <span class=\"nag-n-override\" style=\"margin-left: 0 !important;\"><i>n<\/i><\/span>AG is also working on a new approach that avoids changing the data layout entirely.<\/p>\n<ul>\n<li>Multi-threading.<\/li>\n<\/ul>\n<p>For AAD, dco\/c++ writes an image of your program into memory while executing it. This image is called the \u2018tape\u2019. For efficiency reasons, this is a single global tape. 
Multi-threaded code might therefore introduce data races on this global data. To avoid these, dco\/c++ comes with thread-safe tape storage. In addition, dco\/c++ supports multiple tapes, i.e., each variable can be associated with a (thread-)locally allocated tape. Either of these features, depending on requirements, can be used to overcome the issue of multiple threads writing into global data.<\/p>\n<ul>\n<li>GPUs.<\/li>\n<\/ul>\n<p>The last aspect we want to cover today is AAD of GPU code. This is inherently a difficult problem. Depending on the application, GPU adjoints can be achieved either by using higher-level libraries or by using AAD tools specific to GPUs, like dco\/map (\u2018map\u2019 stands for \u2018meta adjoint programming\u2019). Like GPU programming itself, getting efficient GPU adjoints is a hard task. Please let us know if you need an experienced partner by your side.<\/p>\n<p><span class=\"nag-n-override\" style=\"margin-left: 0 !important;\"><i>n<\/i><\/span>AG\u2019s AD toolset has been developed and enhanced over the last 12 years and builds upon a further 10 years of AD R&amp;D experience. We know that details matter. Applying AD to a parallel code base is part of our day-to-day work.<\/p>\n<p>Myths are narratives that might sound like truths, but by talking through them in some detail and sharing our experiences, we hope to help businesses navigate these issues. Results matter; myths should not. 
\u00a0<\/p>\n<\/div>\n        <\/div>\n    <\/div>\n<\/div>\n\n\n<div class=\"gbc-title-banner tac tac-lg tac-xl\" style='border-radius: 0px; '>\n    <div class=\"container\" style='border-radius: 0px; '>\n        <div class=\"row justify-content--center\" >\n            <div class=\"col-12\"  >\n                <div class=\"wrap pv-4 \" style=\"0pxbackground-color: \">\n                                <div class=\"col-12 col-md-10 col-lg-8 col-xl-6  banner-content\"  >\n    \n                    \n                    <div class=\"mt-1 mb-1 content\"><\/div>\n\n                    \n                    <a href='https:\/\/nag.com\/insights\/' style='background-color: #ff7d21ff; color: #ffffffff; border-radius: 30px; font-weight: 600; ' class='btn mr-1  ' >More Insights <i class='fas fa-angle-right'><\/i><\/a>                <\/div>\n                <\/div>\n            <\/div>\n        <\/div>\n    <\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>AD is a semantic program transformation, which adds functionality to your original implementation. 
The transformed program not only computes the function values, e.g., the price of a financial instrument, but also the sensitivity or derivative thereof, e.g., the first-order Greeks.<\/p>\n","protected":false},"author":5,"featured_media":988,"parent":0,"menu_order":0,"template":"","meta":{"content-type":"","footnotes":""},"post-tag":[16,25],"class_list":["post-1459","insights","type-insights","status-publish","has-post-thumbnail","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.8 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>AD Myths Debunked: Adjoint Automatic Differentiation (AAD) Will Destroy Parallelism - nAG<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"AD Myths Debunked: Adjoint Automatic Differentiation (AAD) Will Destroy Parallelism - nAG\" \/>\n<meta property=\"og:description\" content=\"AD is a semantic program transformation, which adds functionality to your original implementation. 
The transformed program not only computes the function values, e.g., the price of a financial instrument, but also the sensitivity or derivative thereof, e.g., the first-order Greeks.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/\" \/>\n<meta property=\"og:site_name\" content=\"nAG\" \/>\n<meta property=\"article:modified_time\" content=\"2023-07-05T10:18:22+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/nag.com\/wp-content\/uploads\/2023\/05\/Blog_Post-Myth-1-1024x576.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"576\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@NAGTalk\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/\",\"url\":\"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/\",\"name\":\"AD Myths Debunked: Adjoint Automatic Differentiation (AAD) Will Destroy Parallelism - 
nAG\",\"isPartOf\":{\"@id\":\"https:\/\/nag.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/nag.com\/wp-content\/uploads\/2023\/05\/Blog_Post-Myth-1.png\",\"datePublished\":\"2022-07-06T09:52:00+00:00\",\"dateModified\":\"2023-07-05T10:18:22+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/#primaryimage\",\"url\":\"https:\/\/nag.com\/wp-content\/uploads\/2023\/05\/Blog_Post-Myth-1.png\",\"contentUrl\":\"https:\/\/nag.com\/wp-content\/uploads\/2023\/05\/Blog_Post-Myth-1.png\",\"width\":5333,\"height\":3000,\"caption\":\"Automatic Differentiation\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/nag.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Insights\",\"item\":\"https:\/\/nag.com\/insights\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"AD Myths Debunked: Adjoint Automatic Differentiation (AAD) Will Destroy 
Parallelism\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/nag.com\/#website\",\"url\":\"https:\/\/nag.com\/\",\"name\":\"NAG\",\"description\":\"Robust, trusted numerical software and computational expertise.\",\"publisher\":{\"@id\":\"https:\/\/nag.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/nag.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/nag.com\/#organization\",\"name\":\"Numerical Algorithms Group\",\"alternateName\":\"NAG\",\"url\":\"https:\/\/nag.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/nag.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/nag.com\/wp-content\/uploads\/2023\/11\/NAG-Logo.png\",\"contentUrl\":\"https:\/\/nag.com\/wp-content\/uploads\/2023\/11\/NAG-Logo.png\",\"width\":1244,\"height\":397,\"caption\":\"Numerical Algorithms Group\"},\"image\":{\"@id\":\"https:\/\/nag.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/NAGTalk\",\"https:\/\/www.linkedin.com\/company\/nag\/\",\"https:\/\/www.youtube.com\/user\/NumericalAlgorithms\",\"https:\/\/github.com\/numericalalgorithmsgroup\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"AD Myths Debunked: Adjoint Automatic Differentiation (AAD) Will Destroy Parallelism - nAG","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/","og_locale":"en_US","og_type":"article","og_title":"AD Myths Debunked: Adjoint Automatic Differentiation (AAD) Will Destroy Parallelism - nAG","og_description":"AD is a semantic program transformation, which adds functionality to your original implementation. The transformed program not only computes the function values, e.g., the price of a financial instrument, but also the sensitivity or derivative thereof, e.g., the first-order Greeks.","og_url":"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/","og_site_name":"nAG","article_modified_time":"2023-07-05T10:18:22+00:00","og_image":[{"width":1024,"height":576,"url":"https:\/\/nag.com\/wp-content\/uploads\/2023\/05\/Blog_Post-Myth-1-1024x576.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_site":"@NAGTalk","schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/","url":"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/","name":"AD Myths Debunked: Adjoint Automatic Differentiation (AAD) Will Destroy Parallelism - 
nAG","isPartOf":{"@id":"https:\/\/nag.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/#primaryimage"},"image":{"@id":"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/#primaryimage"},"thumbnailUrl":"https:\/\/nag.com\/wp-content\/uploads\/2023\/05\/Blog_Post-Myth-1.png","datePublished":"2022-07-06T09:52:00+00:00","dateModified":"2023-07-05T10:18:22+00:00","breadcrumb":{"@id":"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/#primaryimage","url":"https:\/\/nag.com\/wp-content\/uploads\/2023\/05\/Blog_Post-Myth-1.png","contentUrl":"https:\/\/nag.com\/wp-content\/uploads\/2023\/05\/Blog_Post-Myth-1.png","width":5333,"height":3000,"caption":"Automatic Differentiation"},{"@type":"BreadcrumbList","@id":"https:\/\/nag.com\/insights\/ad-myths-debunked-adjoint-automatic-differentiation-aad-will-destroy-parallelism\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/nag.com\/"},{"@type":"ListItem","position":2,"name":"Insights","item":"https:\/\/nag.com\/insights\/"},{"@type":"ListItem","position":3,"name":"AD Myths Debunked: Adjoint Automatic Differentiation (AAD) Will Destroy Parallelism"}]},{"@type":"WebSite","@id":"https:\/\/nag.com\/#website","url":"https:\/\/nag.com\/","name":"NAG","description":"Robust, trusted numerical software and computational 
expertise.","publisher":{"@id":"https:\/\/nag.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/nag.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/nag.com\/#organization","name":"Numerical Algorithms Group","alternateName":"NAG","url":"https:\/\/nag.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/nag.com\/#\/schema\/logo\/image\/","url":"https:\/\/nag.com\/wp-content\/uploads\/2023\/11\/NAG-Logo.png","contentUrl":"https:\/\/nag.com\/wp-content\/uploads\/2023\/11\/NAG-Logo.png","width":1244,"height":397,"caption":"Numerical Algorithms Group"},"image":{"@id":"https:\/\/nag.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/NAGTalk","https:\/\/www.linkedin.com\/company\/nag\/","https:\/\/www.youtube.com\/user\/NumericalAlgorithms","https:\/\/github.com\/numericalalgorithmsgroup"]}]}},"_links":{"self":[{"href":"https:\/\/nag.com\/wp-json\/wp\/v2\/insights\/1459","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nag.com\/wp-json\/wp\/v2\/insights"}],"about":[{"href":"https:\/\/nag.com\/wp-json\/wp\/v2\/types\/insights"}],"author":[{"embeddable":true,"href":"https:\/\/nag.com\/wp-json\/wp\/v2\/users\/5"}],"version-history":[{"count":5,"href":"https:\/\/nag.com\/wp-json\/wp\/v2\/insights\/1459\/revisions"}],"predecessor-version":[{"id":2999,"href":"https:\/\/nag.com\/wp-json\/wp\/v2\/insights\/1459\/revisions\/2999"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/nag.com\/wp-json\/wp\/v2\/media\/988"}],"wp:attachment":[{"href":"https:\/\/nag.com\/wp-json\/wp\/v2\/media?parent=1459"}],"wp:term":[{"taxonomy":"post-tag","embeddable":true,"href":"https:\/\/nag.com\/wp-json\/wp\/v2\/post-tag?post=1459"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}