Amazon Web Services Blog

New – Amazon Comprehend Medical Adds Ontology Linking

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights in unstructured text. It is very easy to use, with no machine learning experience required. You can customize Comprehend for your specific use case, for example creating custom document classifiers to organize your documents into your own categories, or custom entity types that analyze text for your specific terms. However, medical terminology can be very complex and specific to the healthcare domain. For this reason, last year we introduced Amazon Comprehend Medical, a HIPAA eligible natural language processing service that makes it easy to use machine learning to extract relevant medical information from unstructured text. Using Comprehend Medical, you can quickly and accurately gather information, such as medical condition, medication, dosage, strength, and frequency, from a variety of sources like doctors’ notes, clinical trial reports, and patient health records. Today, we are adding the capability of linking the information extracted by Comprehend Medical to medical ontologies. An ontology provides a declarative model of a domain that defines and represents the concepts existing in that domain, their attributes, and the relationships between them. It is typically represented as a knowledge base, and made available to applications that need to use or share knowledge. Within health informatics, an ontology is a formal description of a health-related domain. The ontologies supported by Comprehend Medical are: ICD-10-CM, to identify medical conditions as entities and link related information such as diagnosis, severity, and anatomical distinctions as attributes of that entity. This is a diagnosis code set that is very useful for population health analytics, and for getting payments from insurance companies based on medical services rendered. RxNorm, to identify medications as entities and link attributes such as dose, frequency, strength, and route of administration to that entity. Healthcare providers use these concepts to enable use cases like medication reconciliation, which is the process of creating the most accurate list possible of all medications a patient is taking. For each ontology, Comprehend Medical returns a ranked list of potential matches. You can use confidence scores to decide which matches make sense, or what might need further review. Let’s see how this works with an example. Using Ontology Linking In the Comprehend Medical console, I start by entering some unstructured doctor’s notes as input: First, I use some of the functionality that was already available in Comprehend Medical to detect medical and protected health information (PHI) entities. Among the recognized entities (see this post for more info) there are some symptoms and medications. Medications are recognized as generics or brands. Let’s see how we can connect some of these entities to more specific concepts. I use the new features to link those entities to RxNorm concepts for medications. In the text, only the parts mentioning medications are detected. In the details of the response, I see more information. For example, let’s look at one of the detected medications: The first occurrence of the term “Clonidine” (in the second line of the input text above) is linked to the generic concept (on the left in the image below) in the RxNorm ontology. 
The second occurrence of the term “Clonidine” (in the fourth line in the input text above) is followed by an explicit dosage, and is linked to a more prescriptive format that includes dosage (on the right in the image below) in the RxNorm ontology. To look for for medical conditions using ICD-10-CM concepts, I am giving a different input: The idea again is to link the detected entities, like symptoms and diagnoses, to specific concepts. As expected, diagnoses and symptoms are recognized as entities. In the detailed results those entities are linked to the medical conditions in the ICD-10-CM ontology. For example, the two main diagnoses described in the input text are the top results, and specific concepts in the ontology are inferred by Comprehend Medical, each with its own score. In production, you can use Comprehend Medical via API, to integrate these functionalities with your processing workflow. All the screenshots above render visually the structured information returned by the API in JSON format. For example, this is the result of detecting medications (RxNorm concepts): { "Entities": [ { "Id": 0, "Text": "Clonidine", "Category": "MEDICATION", "Type": "GENERIC_NAME", "Score": 0.9933062195777893, "BeginOffset": 83, "EndOffset": 92, "Attributes": [], "Traits": [], "RxNormConcepts": [ { "Description": "Clonidine", "Code": "2599", "Score": 0.9148101806640625 }, { "Description": "168 HR Clonidine 0.00417 MG/HR Transdermal System", "Code": "998671", "Score": 0.8215734958648682 }, { "Description": "Clonidine Hydrochloride 0.025 MG Oral Tablet", "Code": "892791", "Score": 0.7519310116767883 }, { "Description": "10 ML Clonidine Hydrochloride 0.5 MG/ML Injection", "Code": "884225", "Score": 0.7171697020530701 }, { "Description": "Clonidine Hydrochloride 0.2 MG Oral Tablet", "Code": "884185", "Score": 0.6776907444000244 } ] }, { "Id": 1, "Text": "Vyvanse", "Category": "MEDICATION", "Type": "BRAND_NAME", "Score": 0.9995427131652832, "BeginOffset": 148, "EndOffset": 155, "Attributes": [ { "Type": "DOSAGE", "Score": 0.9910679459571838, "RelationshipScore": 0.9999822378158569, "Id": 2, "BeginOffset": 156, "EndOffset": 162, "Text": "50 mgs", "Traits": [] }, { "Type": "ROUTE_OR_MODE", "Score": 0.9997182488441467, "RelationshipScore": 0.9993833303451538, "Id": 3, "BeginOffset": 163, "EndOffset": 165, "Text": "po", "Traits": [] }, { "Type": "FREQUENCY", "Score": 0.983681321144104, "RelationshipScore": 0.9999642372131348, "Id": 4, "BeginOffset": 166, "EndOffset": 184, "Text": "at breakfast daily", "Traits": [] } ], "Traits": [], "RxNormConcepts": [ { "Description": "lisdexamfetamine dimesylate 50 MG Oral Capsule [Vyvanse]", "Code": "854852", "Score": 0.8883932828903198 }, { "Description": "lisdexamfetamine dimesylate 50 MG Chewable Tablet [Vyvanse]", "Code": "1871469", "Score": 0.7482635378837585 }, { "Description": "Vyvanse", "Code": "711043", "Score": 0.7041242122650146 }, { "Description": "lisdexamfetamine dimesylate 70 MG Oral Capsule [Vyvanse]", "Code": "854844", "Score": 0.23675969243049622 }, { "Description": "lisdexamfetamine dimesylate 60 MG Oral Capsule [Vyvanse]", "Code": "854848", "Score": 0.14077001810073853 } ] }, { "Id": 5, "Text": "Clonidine", "Category": "MEDICATION", "Type": "GENERIC_NAME", "Score": 0.9982216954231262, "BeginOffset": 199, "EndOffset": 208, "Attributes": [ { "Type": "STRENGTH", "Score": 0.7696017026901245, "RelationshipScore": 0.9999960660934448, "Id": 6, "BeginOffset": 209, "EndOffset": 216, "Text": "0.2 mgs", "Traits": [] }, { "Type": "DOSAGE", "Score": 
0.777644693851471, "RelationshipScore": 0.9999927282333374, "Id": 7, "BeginOffset": 220, "EndOffset": 236, "Text": "1 and 1 / 2 tabs", "Traits": [] }, { "Type": "ROUTE_OR_MODE", "Score": 0.9981689453125, "RelationshipScore": 0.999950647354126, "Id": 8, "BeginOffset": 237, "EndOffset": 239, "Text": "po", "Traits": [] }, { "Type": "FREQUENCY", "Score": 0.99753737449646, "RelationshipScore": 0.9999889135360718, "Id": 9, "BeginOffset": 240, "EndOffset": 243, "Text": "qhs", "Traits": [] } ], "Traits": [], "RxNormConcepts": [ { "Description": "Clonidine Hydrochloride 0.2 MG Oral Tablet", "Code": "884185", "Score": 0.9600071907043457 }, { "Description": "Clonidine Hydrochloride 0.025 MG Oral Tablet", "Code": "892791", "Score": 0.8955953121185303 }, { "Description": "24 HR Clonidine Hydrochloride 0.2 MG Extended Release Oral Tablet", "Code": "885880", "Score": 0.8706559538841248 }, { "Description": "12 HR Clonidine Hydrochloride 0.2 MG Extended Release Oral Tablet", "Code": "1013937", "Score": 0.786146879196167 }, { "Description": "Chlorthalidone 15 MG / Clonidine Hydrochloride 0.2 MG Oral Tablet", "Code": "884198", "Score": 0.601354718208313 } ] } ], "ModelVersion": "0.0.0" } Similarly, this is the output when detecting medical conditions (ICD-10-CM concepts): { "Entities": [ { "Id": 0, "Text": "coronary artery disease", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME", "Score": 0.9933860898017883, "BeginOffset": 90, "EndOffset": 113, "Attributes": [], "Traits": [ { "Name": "DIAGNOSIS", "Score": 0.9682672023773193 } ], "ICD10CMConcepts": [ { "Description": "Atherosclerotic heart disease of native coronary artery without angina pectoris", "Code": "I25.10", "Score": 0.8199513554573059 }, { "Description": "Atherosclerotic heart disease of native coronary artery", "Code": "I25.1", "Score": 0.4950370192527771 }, { "Description": "Old myocardial infarction", "Code": "I25.2", "Score": 0.18753206729888916 }, { "Description": "Atherosclerotic heart disease of native coronary artery with unstable angina pectoris", "Code": "I25.110", "Score": 0.16535982489585876 }, { "Description": "Atherosclerotic heart disease of native coronary artery with unspecified angina pectoris", "Code": "I25.119", "Score": 0.15222692489624023 } ] }, { "Id": 2, "Text": "atrial fibrillation", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME", "Score": 0.9923409223556519, "BeginOffset": 116, "EndOffset": 135, "Attributes": [], "Traits": [ { "Name": "DIAGNOSIS", "Score": 0.9708861708641052 } ], "ICD10CMConcepts": [ { "Description": "Unspecified atrial fibrillation", "Code": "I48.91", "Score": 0.7011875510215759 }, { "Description": "Chronic atrial fibrillation", "Code": "I48.2", "Score": 0.28612759709358215 }, { "Description": "Paroxysmal atrial fibrillation", "Code": "I48.0", "Score": 0.21157972514629364 }, { "Description": "Persistent atrial fibrillation", "Code": "I48.1", "Score": 0.16996538639068604 }, { "Description": "Atrial premature depolarization", "Code": "I49.1", "Score": 0.16715925931930542 } ] }, { "Id": 3, "Text": "hypertension", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME", "Score": 0.9993137121200562, "BeginOffset": 138, "EndOffset": 150, "Attributes": [], "Traits": [ { "Name": "DIAGNOSIS", "Score": 0.9734011888504028 } ], "ICD10CMConcepts": [ { "Description": "Essential (primary) hypertension", "Code": "I10", "Score": 0.6827990412712097 }, { "Description": "Hypertensive heart disease without heart failure", "Code": "I11.9", "Score": 0.09846580773591995 }, { "Description": "Hypertensive heart disease with heart 
failure", "Code": "I11.0", "Score": 0.09182810038328171 }, { "Description": "Pulmonary hypertension, unspecified", "Code": "I27.20", "Score": 0.0866364985704422 }, { "Description": "Primary pulmonary hypertension", "Code": "I27.0", "Score": 0.07662317156791687 } ] }, { "Id": 4, "Text": "hyperlipidemia", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME", "Score": 0.9998835325241089, "BeginOffset": 153, "EndOffset": 167, "Attributes": [], "Traits": [ { "Name": "DIAGNOSIS", "Score": 0.9702492356300354 } ], "ICD10CMConcepts": [ { "Description": "Hyperlipidemia, unspecified", "Code": "E78.5", "Score": 0.8378056883811951 }, { "Description": "Disorders of lipoprotein metabolism and other lipidemias", "Code": "E78", "Score": 0.20186281204223633 }, { "Description": "Lipid storage disorder, unspecified", "Code": "E75.6", "Score": 0.18514418601989746 }, { "Description": "Pure hyperglyceridemia", "Code": "E78.1", "Score": 0.1438658982515335 }, { "Description": "Other hyperlipidemia", "Code": "E78.49", "Score": 0.13983778655529022 } ] }, { "Id": 5, "Text": "chills", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME", "Score": 0.9989762306213379, "BeginOffset": 211, "EndOffset": 217, "Attributes": [], "Traits": [ { "Name": "SYMPTOM", "Score": 0.9510533213615417 } ], "ICD10CMConcepts": [ { "Description": "Chills (without fever)", "Code": "R68.83", "Score": 0.7460958361625671 }, { "Description": "Fever, unspecified", "Code": "R50.9", "Score": 0.11848161369562149 }, { "Description": "Typhus fever, unspecified", "Code": "A75.9", "Score": 0.07497859001159668 }, { "Description": "Neutropenia, unspecified", "Code": "D70.9", "Score": 0.07332006841897964 }, { "Description": "Lassa fever", "Code": "A96.2", "Score": 0.0721040666103363 } ] }, { "Id": 6, "Text": "nausea", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME", "Score": 0.9993392825126648, "BeginOffset": 220, "EndOffset": 226, "Attributes": [], "Traits": [ { "Name": "SYMPTOM", "Score": 0.9175007939338684 } ], "ICD10CMConcepts": [ { "Description": "Nausea", "Code": "R11.0", "Score": 0.7333012819290161 }, { "Description": "Nausea with vomiting, unspecified", "Code": "R11.2", "Score": 0.20183530449867249 }, { "Description": "Hematemesis", "Code": "K92.0", "Score": 0.1203150525689125 }, { "Description": "Vomiting, unspecified", "Code": "R11.10", "Score": 0.11658868193626404 }, { "Description": "Nausea and vomiting", "Code": "R11", "Score": 0.11535880714654922 } ] }, { "Id": 8, "Text": "flank pain", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME", "Score": 0.9315784573554993, "BeginOffset": 235, "EndOffset": 245, "Attributes": [ { "Type": "ACUITY", "Score": 0.9809532761573792, "RelationshipScore": 0.9999837875366211, "Id": 7, "BeginOffset": 229, "EndOffset": 234, "Text": "acute", "Traits": [] } ], "Traits": [ { "Name": "SYMPTOM", "Score": 0.8182812929153442 } ], "ICD10CMConcepts": [ { "Description": "Unspecified abdominal pain", "Code": "R10.9", "Score": 0.4959934949874878 }, { "Description": "Generalized abdominal pain", "Code": "R10.84", "Score": 0.12332479655742645 }, { "Description": "Lower abdominal pain, unspecified", "Code": "R10.30", "Score": 0.08319114148616791 }, { "Description": "Upper abdominal pain, unspecified", "Code": "R10.10", "Score": 0.08275411278009415 }, { "Description": "Jaw pain", "Code": "R68.84", "Score": 0.07797083258628845 } ] }, { "Id": 10, "Text": "numbness", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME", "Score": 0.9659366011619568, "BeginOffset": 255, "EndOffset": 263, "Attributes": [ { "Type": "SYSTEM_ORGAN_SITE", 
"Score": 0.9976192116737366, "RelationshipScore": 0.9999089241027832, "Id": 11, "BeginOffset": 271, "EndOffset": 274, "Text": "leg", "Traits": [] } ], "Traits": [ { "Name": "SYMPTOM", "Score": 0.7310190796852112 } ], "ICD10CMConcepts": [ { "Description": "Anesthesia of skin", "Code": "R20.0", "Score": 0.767346203327179 }, { "Description": "Paresthesia of skin", "Code": "R20.2", "Score": 0.13602739572525024 }, { "Description": "Other complications of anesthesia", "Code": "T88.59", "Score": 0.09990577399730682 }, { "Description": "Hypothermia following anesthesia", "Code": "T88.51", "Score": 0.09953102469444275 }, { "Description": "Disorder of the skin and subcutaneous tissue, unspecified", "Code": "L98.9", "Score": 0.08736388385295868 } ] } ], "ModelVersion": "0.0.0" } Available Now You can use Amazon Comprehend Medical via the console, AWS Command Line Interface (CLI), or AWS SDKs. With Comprehend Medical, you pay only for what you use. You are charged based on the amount of text processed on a monthly basis, depending on the features you use. For more information, please see the Comprehend Medical section in the Comprehend Pricing page. Ontology Linking is available in all regions were Amazon Comprehend Medical is offered, as described in the AWS Regions Table. The new ontology linking APIs make it easy to detect medications and medical conditions in unstructured clinical text and link them to RxNorm and ICD-10-CM codes respectively. This new feature can help you reduce the cost, time and effort of processing large amounts of unstructured medical text with high accuracy. — Danilo

AWS Links & Updates – Monday, December 9, 2019

With re:Invent 2019 behind me, I have a fairly light blogging load for the rest of the month. I do, however, have a collection of late-breaking news and links that I want to share while they are still hot out of the oven! AWS Online Tech Talks for December – We have 18 tech talks scheduled for the remainder of the month. You can learn about Running Kubernetes on AWS Fargate, What’s New with AWS IoT, Transforming Healthcare with AI, and much more! AWS Outposts: Ordering and Installation Overview – This video walks you through the process of ordering and installing an Outposts rack. You will learn about the physical, electrical, and network requirements, and you will get to see an actual install first-hand. NFL Digital Athlete – We have partnered with the NFL to use data and analytics to co-develop the Digital Athlete, a platform that aims to improve player safety & treatment, and to predict & prevent injury. Watch the video in this tweet to learn more: AWS JPL Open Source Rover Challenge – Build and train a reinforcement learning (RL) model on AWS to autonomously drive JPL’s Open-Source Rover between given locations in a simulated Mars environment with the least amount of energy consumption and risk of damage. To learn more, visit the web site or watch the Launchpad Video. Map for Machine Learning on AWS – My colleague Julien Simon created an awesome map that categorizes all of the ML and AI services. The map covers applied ML, SageMaker’s built-in environments, ML for text, ML for any data, ML for speech, ML for images & video, fraud detection, personalization & recommendation, and time series. The linked article contains a scaled-down version of the image; the original version is best! Verified Author Badges for Serverless App Repository – The authors of applications in the Serverless Application Repository can now apply for a Verified Author badge that will appear next to the author’s name on the application card and the detail page. Cloud Innovation Centers – We announced that we will open three more Cloud Innovation Centers in 2020 (one in Australia and two in Bahrain), bringing the global total to eleven. Machine Learning Embark – This new program is designed to help companies transform their development teams into machine learning practitioners. It is based on our own internal experience, and will help to address and overcome common challenges in the machine learning journey. Read the blog post to learn more. Enjoy! — Jeff;

Check out The Amazon Builders’ Library – This is How We Do It!

Amazon customers often tell us that they want to know more about how we build and run our business. On the retail side, they tour Amazon Fulfillment Centers and see how we organize our warehouses. Corporate customers often ask about our Leadership Principles, and sometimes adopt (and then adapt) them for their own use. I regularly speak with customers in our Executive Briefing Center (EBC), and talk to them about working backwards, PRFAQs, narratives, bar-raising, accepting failure as part of long-term success, and our culture of innovation. The same curiosity that surrounds our business surrounds our development culture. We are often asked how we design, build, measure, run, and scale the hardware and software systems that underlie Amazon.com, AWS, and our other businesses. New Builders’ Library Today I am happy to announce The Amazon Builders’ Library. We are launching with a collection of detailed articles that will tell you exactly how we build and run our systems, each one written by the senior technical leaders who have deep expertise in that part of our business. This library is designed to give you direct access to the theory and the practices that underlie our work. Students, developers, dev managers, architects, and CTOs will all find this content to be helpful. This is the content that is “not sold in stores” and not taught in school! The library is organized by category: Architecture – The design decisions that we make when designing a cloud service that help us to optimize for security, durability, high availability, and performance. Software Delivery & Operations – The process of releasing new software to the cloud and maintaining health & high availability thereafter. Inside the Library I took a quick look at two of the articles while writing this post, and learned a lot! Avoiding insurmountable queue backlogs – Principal Engineer David Yanacek explores the ins and outs of message queues, covering the benefits and the risks, including many of the failure modes that can arise. He talks about how queues are used to power AWS Lambda and AWS IoT Core, and describes the sophisticated strategies that are used to maintain responsiveness and to implement (in his words) “magical resource isolation.” David shares multiple patterns that are used to create asynchronous multitenant systems that are resilient, including use of multiple queues, shuffle sharding, delay queues, back-pressure, and more. Challenges with distributed systems – Senior Principal Engineer Jacob Gabrielson discusses the many ways that distributed systems can fail. After defining three distinct types (offline, soft real-time, and hard real-time) of systems, he uses an analogy with Bizarro to explain why hard real-time systems are (again, in his words) “frankly, a bit on the evil side.” Building on an example based on Pac-Man, he adds some request/reply communication and enumerates all of the ways that it can succeed or fail. He discusses fate sharing and how it can be used to reduce the number of test cases, and also talks about many of the other difficulties that come with testing distributed systems. These are just two of the articles; be sure to check out the entire collection. More to Come We’ve got a lot more content in the pipeline, and we are also interested in your stories. Please feel free to leave feedback on this post, and we’ll be in touch. — Jeff;
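A short aside on one of the patterns David mentions: shuffle sharding is easier to picture with a toy sketch. The following Python snippet is not from the article; it only illustrates the core idea of mapping each customer to a small, stable subset of a larger pool of queues, so that two customers rarely share their entire subset and a noisy workload stays mostly isolated.

import hashlib
import random

NUM_QUEUES = 16   # size of the shared queue pool (illustrative)
SHARD_SIZE = 2    # number of queues assigned to each customer

def shuffle_shard(customer_id):
    # Seed a PRNG from the customer id so the mapping is stable across calls.
    seed = int(hashlib.sha256(customer_id.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return sorted(rng.sample(range(NUM_QUEUES), SHARD_SIZE))

print(shuffle_shard("customer-a"))   # e.g. [3, 11]
print(shuffle_shard("customer-b"))   # usually a different, mostly non-overlapping pair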

AWS Launches & Previews at re:Invent 2019 – Wednesday, December 4th

Here’s what we announced today: Amplify DataStore – This is a persistent, on-device storage repository that will help you to synchronize data across devices and to handle offline operations. It can be used as a standalone local datastore for web and mobile applications that have no connection to the cloud or an AWS account. When used with a cloud backend, it transparently synchronizes data with AWS AppSync. Amplify iOS and Amplify Android – These open source libraries enable you to build scalable and secure mobile applications. You can easily add analytics, AI/ML, API (GraphQL and REST), datastore, and storage functionality to your mobile and web applications. The use case-centric libraries provide a declarative interface that enables you to programmatically apply best practices with abstractions. The libraries, along with the Amplify CLI, a toolchain to create, integrate, and manage the cloud services used by your applications, are part of the Amplify Framework. Amazon Neptune Workbench – You can now query your graphs from within the Neptune Console using either Gremlin or SPARQL queries. You get a fully managed, interactive development environment that supports live code and narrative text within Jupyter notebooks. In addition to queries, the notebooks support bulk loading, query planning, and query profiling. To get started, visit the Neptune Console. Amazon Chime Meetings App for Slack – This new app allows Slack users to start and join Amazon Chime online meetings from their Slack workspace channels and conversations. Slack users that are new to Amazon Chime will be auto-registered with Chime when they use the app for the first time, and can get access to all of the benefits of Amazon Chime meetings from their Slack workspace. Administrators of Slack workspaces can install the Amazon Chime Meetings App for Slack from the Slack App Directory. To learn more, visit this blog post. HTTP APIs for Amazon API Gateway in Preview – This is a new API Gateway feature that will let you build cost-effective, high-performance RESTful APIs for serverless workloads using Lambda functions and other services with an HTTP endpoint. HTTP APIs are optimized for performance—they offer the core functionality of API Gateway at a cost savings of up to 70% compared to REST APIs in API Gateway. You will be able to create routes that map to multiple disparate backends, define & apply authentication and authorization to routes, set up rate limiting, and use custom domains to route requests to the APIs. Visit this blog post to get started. Windows gMSA Support in ECS – Amazon Elastic Container Service (ECS) now supports Windows group Managed Service Account (gMSA), a new capability that allows you to authenticate and authorize your ECS-powered Windows containers with network resources using an Active Directory (AD). You can now easily use Integrated Windows Authentication with your Windows containers on ECS to secure services. — Jeff;
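An aside on the HTTP APIs preview above: the "quick create" flow can be sketched in a few lines of code. The following is illustrative only; the function ARN and names are placeholders, and the API surface may evolve while the feature is in preview.

import boto3

# Hypothetical sketch: create an HTTP API that proxies every request to a
# Lambda function. The function ARN below is a placeholder, and the function
# must also grant API Gateway permission to invoke it (lambda add-permission).
apigw = boto3.client("apigatewayv2", region_name="us-east-1")

api = apigw.create_api(
    Name="my-http-api",
    ProtocolType="HTTP",
    Target="arn:aws:lambda:us-east-1:123456789012:function:my-function",
)
print("Invoke URL:", api["ApiEndpoint"])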

Amplify DataStore – Simplify Development of Offline Apps with GraphQL

The open source Amplify Framework is a command line tool and a library allowing web & mobile developers to easily provision and access cloud-based services. For example, if I want to create a GraphQL API for my mobile application, I use amplify add api on my development machine to configure the backend API. After answering a few questions, I type amplify push to create an AWS AppSync API backend in the cloud. Amplify generates code allowing my app to easily access the newly created API. Amplify supports popular web frameworks, such as Angular, React, and Vue. It also supports mobile applications developed with React Native, Swift for iOS, or Java for Android. If you want to learn more about how to use Amplify for your mobile applications, feel free to attend one of the workshops (iOS or React Native) we prepared for the re:Invent 2019 conference. AWS customers told us that the most difficult tasks when developing web & mobile applications are synchronizing data across devices and handling offline operations. Ideally, when a device is offline, your customers should be able to continue to use your application, not only to access data but also to create and modify them. When the device comes back online, the application must reconnect to the backend, synchronize the data and resolve conflicts, if any. It requires a lot of undifferentiated code to correctly handle all edge cases, even when using AWS AppSync SDK’s on-device cache with offline mutations and delta sync. Today, we are introducing Amplify DataStore, a persistent on-device storage repository for developers to write, read, and observe changes to data. Amplify DataStore allows developers to write apps leveraging distributed data without writing additional code for offline or online scenarios. Amplify DataStore can be used as a stand-alone local datastore in web and mobile applications, with no connection to the cloud, or the need to have an AWS Account. However, when used with a cloud backend, Amplify DataStore transparently synchronizes data with an AWS AppSync API when network connectivity is available. Amplify DataStore automatically versions data and implements conflict detection and resolution in the cloud using AppSync. The toolchain also generates object definitions for my programming language based on the GraphQL schema developers provide. Let’s see how it works. I first install the Amplify CLI and create a React App. This is standard React; you can find the script on my git repo. I add Amplify DataStore to the app with npx amplify-app. npx is specific to Node.js; Amplify DataStore also integrates with native mobile toolchains, such as the Gradle plugin for Android Studio and CocoaPods, which creates custom Xcode build phases for iOS. Now that the scaffolding of my app is done, I add a GraphQL schema representing two entities: Posts and Comments on these posts. I install the dependencies and use the AWS Amplify CLI to generate the source code for the objects defined in the GraphQL schema. # add a graphql schema to amplify/backend/api/amplifyDatasource/schema.graphql echo "enum PostStatus { ACTIVE INACTIVE } type Post @model { id: ID! title: String! comments: [Comment] @connection(name: "PostComments") rating: Int! status: PostStatus! } type Comment @model { id: ID! 
content: String post: Post @connection(name: "PostComments") }" > amplify/backend/api/amplifyDatasource/schema.graphql # install dependencies npm i @aws-amplify/core @aws-amplify/datastore @aws-amplify/pubsub # generate the source code representing the model npm run amplify-modelgen # create the API in the cloud npm run amplify-push @model and @connection are directives that the Amplify GraphQL Transformer uses to generate code. Objects annotated with @model are top-level objects in your API; they are stored in DynamoDB, and you can make them searchable, version them, or restrict their access to authorized users only. @connection allows you to express 1-n relationships between objects, similarly to what you would define when using a relational database (you can use the @key directive to model n-n relationships). The last step is to create the React app itself. I propose to download a very simple sample app to get started quickly: # download a simple react app curl -o src/App.js https://raw.githubusercontent.com/sebsto/amplify-datastore-js-e2e/master/src/App.js # start the app npm run start I connect my browser to the app at http://localhost:8080 and start to test the app. The demo app provides a basic UI (as you can guess, I am not a graphic designer!) to create, query, and delete items. Amplify DataStore provides developers with an easy-to-use API to store, query, and delete data. Reads and writes are propagated in the background to your AppSync endpoint in the cloud. Amplify DataStore uses a local data store via a storage adapter; we ship IndexedDB for web and SQLite for mobile. Amplify DataStore is open source; you can add support for other databases if needed. From a code perspective, interacting with data is as easy as invoking the save(), delete(), or query() operations on the DataStore object (this is a JavaScript example; you would write similar code for Swift or Java). Notice that the query() operation accepts filters based on Predicates expressions, such as item.rating("gt", 4) or Predicates.ALL. function onCreate() { DataStore.save( new Post({ title: `New title ${Date.now()}`, rating: 1, status: PostStatus.ACTIVE }) ); } function onDeleteAll() { DataStore.delete(Post, Predicates.ALL); } async function onQuery(setPosts) { const posts = await DataStore.query(Post, c => c.rating("gt", 4)); setPosts(posts) } async function listPosts(setPosts) { const posts = await DataStore.query(Post, Predicates.ALL); setPosts(posts); } I connect to the Amazon DynamoDB console and observe the items are stored in my backend: There is nothing to change in my code to support offline mode. To simulate offline mode, I turn off my wifi. I add two items in the app and turn on the wifi again. The app continues to operate as usual while offline. The only noticeable change is that the _version field is not updated while offline, as it is populated by the backend. When the network is back, Amplify DataStore transparently synchronizes with the backend. I verify there are 5 items now in DynamoDB (the table name is different for each deployment, be sure to adjust the name for your table below): aws dynamodb scan --table-name Post-raherug3frfibkwsuzphkexewa-amplify \ --filter-expression "#deleted <> :value" \ --expression-attribute-names '{"#deleted" : "_deleted"}' \ --expression-attribute-values '{":value" : { "BOOL": true} }' \ --query "Count" 5 // <= there are now 5 non deleted items in the table ! Amplify DataStore leverages GraphQL subscriptions to keep track of changes that happen on the backend. 
Your customers can modify the data from another device and Amplify DataStore takes care of synchronizing the local data store transparently. No GraphQL knowledge is required; Amplify DataStore takes care of the low-level GraphQL API calls for you automatically. Real-time data, connections, scalability, fan-out and broadcasting are all handled by the Amplify client and AppSync, using the WebSocket protocol under the covers. We are effectively using GraphQL as a network protocol to dynamically transform model instances to GraphQL documents over HTTPS. To refresh the UI when a change happens on the backend, I add the following code to the useEffect() React hook. It uses the DataStore.observe() method to register a callback function ( msg => { ... } ). Amplify DataStore calls this function when an instance of Post changes on the backend. const subscription = DataStore.observe(Post).subscribe(msg => { console.log(msg.model, msg.opType, msg.element); listPosts(setPosts); }); Now, I open the AppSync console. I query existing Posts to retrieve a Post ID. query ListPost { listPosts(limit: 10) { items { id title status rating _version } } } I choose the first post in my app, the one starting with 7d8…, and I send the following GraphQL mutation: mutation UpdatePost { updatePost(input: { id: "7d80688f-898d-4fb6-a632-8cbe060b9691" title: "updated title 13:56" status: ACTIVE rating: 7 _version: 1 }) { id title status rating _lastChangedAt _version _deleted } } Immediately, I see the app receiving the notification and refreshing its user interface. Finally, I test with multiple devices. I first create a hosting environment for my app using amplify add hosting and amplify publish. Once the app is published, I open the iOS Simulator and Chrome side by side. Both apps initially display the same list of items. I create new items in both apps and observe the apps refreshing their UI in near real time. At the end of my test, I delete all items. I verify there are no more items in DynamoDB (the table name is different for each deployment, be sure to adjust the name for your table below): aws dynamodb scan --table-name Post-raherug3frfibkwsuzphkexewa-amplify \ --filter-expression "#deleted <> :value" \ --expression-attribute-names '{"#deleted" : "_deleted"}' \ --expression-attribute-values '{":value" : { "BOOL": true} }' \ --query "Count" 0 // <= all the items have been deleted ! When syncing local data with the backend, AWS AppSync keeps track of version numbers to detect conflicts. When there is a conflict, the default resolution strategy is to automerge the changes on the backend. Automerge is an easy strategy to resolve conflicts without writing client-side code. For example, let’s pretend I have an initial Post, and Bob & Alice update the post at the same time: The original item: { "_version": 1, "id": "25", "rating": 6, "status": "ACTIVE", "title": "DataStore is Available" } Alice updates the rating: { "_version": 2, "id": "25", "rating": 10, "status": "ACTIVE", "title": "DataStore is Available" } At the same time, Bob updates the title: { "_version": 2, "id": "25", "rating": 6, "status": "ACTIVE", "title": "DataStore is great !" } The final item after auto-merge is: { "_version": 3, "id": "25", "rating": 10, "status": "ACTIVE", "title": "DataStore is great !" } Automerge strictly defines merging rules at the field level, based on type information defined in the GraphQL schema. 
For example, List and Map are merged, and conflicting updates on scalars (such as numbers and strings) preserve the value existing on the server. Developers can choose other conflict resolution strategies: optimistic concurrency (conflicting updates are rejected) or custom (an AWS Lambda function is called to decide which version is the correct one). You can choose the conflict resolution strategy with amplify update api. You can read more about these different strategies in the AppSync documentation. The full source code for this demo is available on my git repository. The app has less than 100 lines of code, 20% being just UI related. Notice that I did not write a single line of GraphQL code; everything happens in Amplify DataStore. Your Amplify DataStore cloud backend is available in all AWS Regions where AppSync is available, which, at the time I write this post, are: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (London). There are no additional charges to use Amplify DataStore in your application; you only pay for the backend resources you use, such as AppSync and DynamoDB (see here and here for the pricing details). Both services have a free tier allowing you to discover and to experiment for free. Amplify DataStore allows you to focus on the business value of your apps, instead of writing undifferentiated code. I can’t wait to discover the great applications you’re going to build with it. -- seb

AWS Launches & Previews at re:Invent 2019 – Tuesday, December 3rd

Whew, what a day. This post contains a summary of the announcements that we made today. Launch Blog Posts Here are detailed blog posts for the launches: AWS Outposts Now Available – Order Your Racks Today! Inf1 Instances with AWS Inferentia Chips for High Performance Cost-Effective Inferencing. EBS Direct APIs – Programmatic Access to EBS Snapshot Content. AWS Compute Optimizer – Your Customized Resource Optimization Service. Amazon EKS on AWS Fargate Now Generally Available. AWS Fargate Spot Now Generally Available. ECS Cluster Auto Scaling is Now Generally Available. Easily Manage Shared Data Sets with Amazon S3 Access Points. Amazon Redshift Update – Next-Generation Compute Instances and Managed, Analytics-Optimized Storage. Amazon Redshift – Data Lake Export and Federated Queries. Amazon Rekognition Custom Labels. Amazon SageMaker Studio: The First Fully Integrated Development Environment For Machine Learning. Amazon SageMaker Model Monitor – Fully Managed Automatic Monitoring For Your Machine Learning Models. Amazon SageMaker Experiments – Organize, Track And Compare Your Machine Learning Trainings. Amazon SageMaker Debugger – Debug Your Machine Learning Models. Amazon SageMaker Autopilot – Automatically Create High-Quality Machine Learning Models. Now Available on Amazon SageMaker: The Deep Graph Library. Amazon SageMaker Processing – Fully Managed Data Processing and Model Evaluation. Deep Java Library (DJL). AWS Now Available from a Local Zone in Los Angeles. Lambda Provisioned Concurrency. AWS Step Functions Express Workflows: High Performance & Low Cost. AWS Transit Gateway – Build Global Networks and Centralize Monitoring Using Network Manager. AWS Transit Gateway Adds Multicast and Inter-regional Peering. VPC Ingress Routing – Simplifying Integration of Third-Party Appliances. Amazon Chime Meeting Regions. Other Launches Here’s an overview of some launches that did not get a blog post. I’ve linked to the What’s New or product information pages instead: EBS-Optimized Bandwidth Increase – Thanks to improvements to the Nitro system, all newly launched C5/C5d/C5n/C5dn, M5/M5d/M5n/M5dn, R5/R5d/R5n/R5dn, and P3dn instances will support 36% higher EBS-optimized instance bandwidth, up to 19 Gbps. In addition newly launched High Memory instances (6, 9, 12 TB) will also support 19 Gbps of EBS-optimized instance bandwidth, a 36% increase from 14Gbps. For details on each size, read more about Amazon EBS-Optimized Instances. EC2 Capacity Providers – You will have additional control over how your applications use compute capacity within EC2 Auto Scaling Groups and when using AWS Fargate. You get an abstraction layer that lets you make late binding decisions on capacity, including the ability to choose how much Spot capacity that you would like to use. Read the What’s New to learn more. Previews Here’s an overview of the previews that we revealed today, along with links that will let you sign up and/or learn more (most of these were in Andy’s keynote): AWS Wavelength – AWS infrastructure deployments that embed AWS compute and storage services within the telecommunications providers’ datacenters at the edge of the 5G network to provide developers the ability to build applications that serve end-users with single-digit millisecond latencies. You will be able to extend your existing VPC to a Wavelength Zone and then make use of EC2, EBS, ECS, EKS, IAM, CloudFormation, Auto Scaling, and other services. 
This low-latency access to AWS will enable the next generation of mobile gaming, AR/VR, security, and video processing applications. To learn more, visit the AWS Wavelength page. Amazon Managed Apache Cassandra Service (MCS) – This is a scalable, highly available, and managed Apache Cassandra-compatible database service. Amazon Managed Cassandra Service is serverless, so you pay for only the resources you use and the service automatically scales tables up and down in response to application traffic. You can build applications that serve thousands of requests per second with virtually unlimited throughput and storage. To learn more, read New – Amazon Managed Apache Cassandra Service (MCS). Graviton2-Powered EC2 Instances – New Arm-based general purpose, compute-optimized, and memory-optimized EC2 instances powered by the new Graviton2 processor. The instances offer a significant performance benefit over the 5th generation (M5, C5, and R5) instances, and also raise the bar on security. To learn more, read Coming Soon – Graviton2-Powered General Purpose, Compute-Optimized, & Memory-Optimized EC2 Instances. AWS Nitro Enclaves – AWS Nitro Enclaves will let you create isolated compute environments to further protect and securely process highly sensitive data such as personally identifiable information (PII), healthcare, financial, and intellectual property data within your Amazon EC2 instances. Nitro Enclaves uses the same Nitro Hypervisor technology that provides CPU and memory isolation for EC2 instances. To learn more, visit the Nitro Enclaves page. The Nitro Enclaves preview is coming soon and you can sign up now. Amazon Detective – This service will help you to analyze and visualize security data at scale. You will be able to quickly identify the root causes of potential security issues or suspicious activities. It automatically collects log data from your AWS resources and uses machine learning, statistical analysis, and graph theory to build a linked set of data that will accelerate your security investigation. Amazon Detective can scale to process terabytes of log data and trillions of events. Sign up for the Amazon Detective Preview. Amazon Fraud Detector – This service makes it easy for you to identify potential fraud that is associated with online activities. It uses machine learning and incorporates 20 years of fraud detection expertise from AWS and Amazon.com, allowing you to catch fraud faster than ever before. You can create a fraud detection model with a few clicks, and detect fraud related to new accounts, guest checkout, abuse of try-before-you-buy, and (coming soon) online payments. To learn more, visit the Amazon Fraud Detector page. Amazon Kendra – This is a highly accurate and easy to use enterprise search service that is powered by machine learning. It supports natural language queries and will allow users to discover information buried deep within your organization’s vast content stores. Amazon Kendra will include connectors for popular data sources, along with an API to allow data ingestion from other sources. You can access the Kendra Preview from the AWS Management Console. Contact Lens for Amazon Connect – This is a set of analytics capabilities for Amazon Connect that use machine learning to understand sentiment and trends within customer conversations in your contact center. 
Once enabled, specified calls are automatically transcribed using state-of-the-art machine learning techniques, fed through a natural language processing engine to extract sentiment, and indexed for searching. Contact center supervisors and analysts can look for trends, compliance risks, or contacts based on specific words and phrases mentioned in the call to effectively train agents, replicate successful interactions, and identify crucial company and product feedback. Sign up for the Contact Lens for Amazon Connect Preview. Amazon Augmented AI (A2I) – This service will make it easy for you to build workflows that use a human to review low-confidence machine learning predictions. The service includes built-in workflows for common machine learning use cases including content moderation (via Amazon Rekognition) and text extraction (via Amazon Textract), and also allows you to create your own. You can use a pool of reviewers within your own organization, or you can access the workforce of over 500,000 independent contractors who are already performing machine learning tasks through Amazon Mechanical Turk. You can also make use of workforce vendors that are pre-screened by AWS for quality and adherence to security procedures. To learn more, read about Amazon Augmented AI (Amazon A2I), or visit the A2I Console to get started. Amazon CodeGuru – This ML-powered service provides code reviews and application performance recommendations. It helps to find the most expensive (computationally speaking) lines of code, and gives you specific recommendations on how to fix or improve them. It has been trained on best practices learned from millions of code reviews, along with code from thousands of Amazon projects and the top 10,000 open source projects. It can identify resource leaks, data race conditions between concurrent threads, and wasted CPU cycles. To learn more, visit the Amazon CodeGuru page. Amazon RDS Proxy – This is a fully managed database proxy that will help you better scale applications, including those built on modern serverless architectures, without worrying about managing connections and connection pools, while also benefiting from faster failover in the event of a database outage. It is highly available and deployed across multiple AZs, and integrates with IAM and AWS Secrets Manager so that you don’t have to embed your database credentials in your code. Amazon RDS Proxy is fully compatible with MySQL protocol and requires no application change. You will be able to create proxy endpoints and start using them in minutes. To learn more, visit the RDS Proxy page. — Jeff;
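A footnote on the RDS Proxy preview above: because the proxy speaks the MySQL protocol, an application simply points its existing database client at the proxy endpoint. The sketch below is illustrative only; the endpoint, credentials, and schema are placeholders, and in practice the credentials would come from AWS Secrets Manager rather than being hard-coded.

import pymysql

# Hypothetical sketch: connect to an RDS Proxy endpoint with a standard MySQL
# client library. Endpoint, credentials, and database name are placeholders.
connection = pymysql.connect(
    host="my-proxy.proxy-abc123xyz.us-east-1.rds.amazonaws.com",
    user="app_user",
    password="example-only",
    database="appdb",
)
with connection.cursor() as cursor:
    cursor.execute("SELECT 1")
    print(cursor.fetchone())
connection.close()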

New – AWS Step Functions Express Workflows: High Performance & Low Cost

We launched AWS Step Functions at re:Invent 2016, and our customers took to the service right away, using them as a core element of their multi-step workflows. Today, we see customers building serverless workflows that orchestrate machine learning training, report generation, order processing, IT automation, and many other multi-step processes. These workflows can run for up to a year, and are built around a workflow model that includes checkpointing, retries for transient failures, and detailed state tracking for auditing purposes. Based on usage and feedback, our customers really like the core Step Functions model. They love the declarative specifications and the ease with which they can build, test, and scale their workflows. In fact, customers like Step Functions so much that they want to use them for high-volume, short-duration use cases such as IoT data ingestion, streaming data processing, and mobile application backends. New Express Workflows Today we are launching Express Workflows as an option to the existing Standard Workflows. The Express Workflows use the same declarative specification model (the Amazon States Language) but are designed for those high-volume, short-duration use cases. Here’s what you need to know: Triggering – You can use events and read/write API calls associated with a long list of AWS services to trigger execution of your Express Workflows. Execution Model – Express Workflows use an at-least-once execution model, and will not attempt to automatically retry any failed steps, but you can use Retry and Catch, as described in Error Handling. The steps are not checkpointed, so per-step status information is not available. Successes and failures are logged to CloudWatch Logs, and you have full control over the logging level. Workflow Steps – Express Workflows support many of the same service integrations as Standard Workflows, with the exception of Activity Tasks. You can initiate long-running services such as AWS Batch, AWS Glue, and Amazon SageMaker, but you cannot wait for them to complete. Duration – Express Workflows can run for up to five minutes of wall-clock time. They can invoke other Express or Standard Workflows, but cannot wait for them to complete. You can also invoke Express Workflows from Standard Workflows, composing both types in order to meet the needs of your application. Event Rate – Express Workflows are designed to support a per-account invocation rate greater than 100,000 events per second. Accounts are configured for 6,000 events per second by default and we will, as usual, raise it on request. Pricing – Standard Workflows are priced based on the number of state transitions. Express Workflows are priced based on the number of invocations and a GB/second charge based on the amount of memory used to track the state of the workflow during execution. While the pricing models are not directly comparable, Express Workflows will be far more cost-effective at scale. To learn more, read about AWS Step Functions Pricing. As you can see, most of what you already know about Standard Workflows also applies to Express Workflows! You can replace some of your Standard Workflows with Express Workflows, and you can use Express Workflows to build new types of applications. Using Express Workflows I can create an Express Workflow and attach it to any desired events with just a few minutes of work. 
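If you script your deployments instead of using the console, the same setup can be done through the API. Here is a minimal boto3 sketch; the role ARN, log group ARN, and state machine definition are placeholders rather than the ones used in this post. The console walkthrough continues below.

import json
import boto3

sfn = boto3.client("stepfunctions", region_name="us-east-1")

# Placeholder definition: a single Pass state, just to show the shape of the call.
definition = {
    "StartAt": "HelloWorld",
    "States": {"HelloWorld": {"Type": "Pass", "Result": "Hello from Express!", "End": True}},
}

response = sfn.create_state_machine(
    name="my-express-workflow",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsExecutionRole",
    type="EXPRESS",  # Standard Workflows use type="STANDARD", which is the default
    loggingConfiguration={
        "level": "ALL",  # Express Workflows report success and failure through CloudWatch Logs
        "includeExecutionData": True,
        "destinations": [
            {"cloudWatchLogsLogGroup": {
                "logGroupArn": "arn:aws:logs:us-east-1:123456789012:log-group:/aws/states/my-express:*"
            }}
        ],
    },
)
print(response["stateMachineArn"])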
I simply choose the Express type in the console: Then I define my state machine: I configure the CloudWatch logging, and add a tag: Now I can attach my Express Workflow to my event source. I open the EventBridge Console and create a new rule: I define a pattern that matches PutObject events on a single S3 bucket: I select my Express Workflow as the event target, add a tag, and click Create: The particular event will occur only if I have a CloudTrail trail that is set up to record object-level activity: Then I upload an image to my bucket, and check the CloudWatch Logs group to confirm that my workflow ran as expected: As a more realistic test, I can upload several hundred images at once and confirm that my Lambda functions are invoked with high concurrency: I can also use the new Monitoring tab in the Step Functions console to view the metrics that are specific to the state machine: Available Now You can create and use AWS Step Functions Express Workflows today in all AWS Regions! — Jeff;

New – Provisioned Concurrency for Lambda Functions

It’s really true that time flies, especially when you don’t have to think about servers: AWS Lambda just turned 5 years old and the team is always looking for new ways to help customers build and run applications in an easier way. As more mission-critical applications move to serverless, customers need more control over the performance of their applications. Today we are launching Provisioned Concurrency, a feature that keeps functions initialized and hyper-ready to respond in double-digit milliseconds. This is ideal for implementing interactive services, such as web and mobile backends, latency-sensitive microservices, or synchronous APIs. When you invoke a Lambda function, the invocation is routed to an execution environment to process the request. When a function has not been used for some time, when you need to process more concurrent invocations, or when you update a function, new execution environments are created. The creation of an execution environment takes care of installing the function code and starting the runtime. Depending on the size of your deployment package, and the initialization time of the runtime and of your code, this can introduce latency for the invocations that are routed to a new execution environment. This latency is usually referred to as a “cold start”. For most applications this additional latency is not a problem. For some applications, however, this latency may not be acceptable. When you enable Provisioned Concurrency for a function, the Lambda service will initialize the requested number of execution environments so they can be ready to respond to invocations. Configuring Provisioned Concurrency I create two Lambda functions that use the same Java code and can be triggered by Amazon API Gateway. To simulate a production workload, these functions repeat some mathematical computation 10 million times in the initialization phase and 200,000 times for each invocation. The computation is using java.Math.Random and conditions (if ...) to avoid compiler optimizations (such as “unlooping” the iterations). Each function has 1GB of memory and the size of the code is 1.7MB. I want to enable Provisioned Concurrency only for one of the two functions, so that I can compare how they react to a similar workload. In the Lambda console, I select one of the functions. In the configuration tab, I see the new Provisioned Concurrency settings. I select Add configuration. Provisioned Concurrency can be enabled for a specific Lambda function version or alias (you can’t use $LATEST). You can have different settings for each version of a function. Using an alias, it is easier to apply these settings to the correct version of your function. In my case I select the alias live that I keep updated to the latest version using the AWS SAM AutoPublishAlias function preference. For the Provisioned Concurrency, I enter 500 and Save. Now, the Provisioned Concurrency configuration is in progress. The execution environments are being prepared to serve concurrent incoming requests based on my input. During this time the function remains available and continues to serve traffic. After a few minutes, the concurrency is ready. With these settings, up to 500 concurrent requests will find an execution environment ready to process them. If I go above that, the usual scaling of Lambda functions still applies. To generate some load, I use an Amazon Elastic Compute Cloud (EC2) instance in the same region. 
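Before moving on to the load test, note that the same configuration can also be applied programmatically. Here is a minimal boto3 sketch; the function name and alias are placeholders, not the actual functions used in this post.

import boto3

lambda_client = boto3.client("lambda", region_name="us-east-1")

# Enable Provisioned Concurrency on a published alias (you can't use $LATEST).
lambda_client.put_provisioned_concurrency_config(
    FunctionName="my-function",
    Qualifier="live",
    ProvisionedConcurrentExecutions=500,
)

# Check the allocation status; it reports READY once the environments are initialized.
status = lambda_client.get_provisioned_concurrency_config(
    FunctionName="my-function", Qualifier="live"
)
print(status["Status"], status.get("AvailableProvisionedConcurrentExecutions"))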
To keep it simple, I use the ab tool bundled with the Apache HTTP Server to call the two API endpoints 10,000 times with a concurrency of 500. Since these are new functions, I expect that: For the function with Provisioned Concurrency enabled and set to 500, my requests are managed by pre-initialized execution environments. For the other function, which has Provisioned Concurrency disabled, about 500 execution environments need to be provisioned, adding some latency to roughly the same number of invocations, about 5% of the total. One cool feature of the ab tool is that it reports the percentage of the requests served within a certain time. That is a very good way to look at API latency, as described in this post on Serverless Latency by Tim Bray. Here are the results for the function with Provisioned Concurrency disabled:

Percentage of the requests served within a certain time (ms)
 50%    351
 66%    359
 75%    383
 80%    396
 90%    435
 95%   1357
 98%   1619
 99%   1657
100%   1923 (longest request)

Looking at these numbers, I see that 50% of the requests are served within 351ms, 66% of the requests within 359ms, and so on. It’s clear that something happens when I look at 95% or more of the requests: the time suddenly increases by about a second. These are the results for the function with Provisioned Concurrency enabled:

Percentage of the requests served within a certain time (ms)
 50%    352
 66%    368
 75%    382
 80%    387
 90%    400
 95%    415
 98%    447
 99%    513
100%    593 (longest request)

Let’s compare those numbers in a graph. As expected for my test workload, I see a big difference in the response time of the slowest 5% of the requests (between 95% and 100%), where the function with Provisioned Concurrency disabled shows the latency added by the creation of new execution environments and the (slow) initialization in my function code. In general, the amount of latency added depends on the runtime you use, the size of your code, and the initialization required by your code to be ready for a first invocation. As a result, the added latency can be more, or less, than what I experienced here. The number of invocations affected by this additional latency depends on how often the Lambda service needs to create new execution environments. Usually that happens when the number of concurrent invocations increases beyond what is already provisioned, or when you deploy a new version of a function. A small percentage of slow response times (generally referred to as tail latency) really makes a difference in end user experience. Over an extended period of time, most users are affected during some of their interactions. With Provisioned Concurrency enabled, user experience is much more stable. Provisioned Concurrency is a Lambda feature and works with any trigger. For example, you can use it with WebSockets APIs, GraphQL resolvers, or IoT Rules. This feature gives you more control when building serverless applications that require low latency, such as web and mobile apps, games, or any service that is part of a complex transaction. Available Now Provisioned Concurrency can be configured using the console, the AWS Command Line Interface (CLI), or AWS SDKs for new or existing Lambda functions, and is available today in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Europe (Stockholm), Middle East (Bahrain), and South America (São Paulo). You can use the AWS Serverless Application Model (SAM) and SAM CLI to test, deploy, and manage serverless applications that use Provisioned Concurrency. With Application Auto Scaling you can automate configuring the required concurrency for your functions. Target Tracking and Scheduled Scaling policies are supported. Using these policies, you can automatically increase the amount of concurrency during times of high demand and decrease it when the demand decreases. You can also use Provisioned Concurrency today with AWS Partner tools, including configuring Provisioned Concurrency settings with the Serverless Framework and Terraform, or viewing metrics with Datadog, Epsagon, Lumigo, New Relic, SignalFx, SumoLogic, and Thundra. You only pay for the amount of concurrency that you configure and for the period of time that you configure it. Pricing in US East (N. Virginia) is $0.015 per GB-hour for Provisioned Concurrency and $0.035 per GB-hour for Duration. The number of requests is charged at the same rate as normal functions. You can find more information on the Lambda pricing page. This new feature enables developers to use Lambda for a variety of workloads that require highly consistent latency. Let me know what you are going to use it for! — Danilo

Amazon ECS Cluster Auto Scaling is Now Generally Available

Today, we have launched Amazon ECS Cluster Auto Scaling. This new capability improves your cluster scaling experience by increasing the speed and reliability of cluster scale-out, giving you control over the amount of spare capacity maintained in your cluster, and automatically managing instance termination on cluster scale-in. To enable ECS Cluster Auto Scaling, you will need to create a new ECS resource type called a Capacity Provider. A Capacity Provider can be associated with an EC2 Auto Scaling Group (ASG). When you associate an ECS Capacity Provider with an ASG and add the Capacity Provider to an ECS cluster, the cluster can now scale your ASG automatically by using two new features of ECS: Managed scaling, with an automatically created scaling policy on your ASG, and a new scaling metric (Capacity Provider Reservation) that the scaling policy uses; and Managed instance termination protection, which enables container-aware termination of instances in the ASG when scale-in happens. These new features will give customers greater control over when and how Amazon ECS clusters scale in and scale out. Capacity Provider Reservation The new metric, called capacity provider reservation, measures the total percentage of cluster resources needed by all ECS workloads in the cluster, including existing workloads, new workloads, and changes in workload size. This metric enables the scaling policy to scale out more quickly and reliably than it could when using CPU or memory reservation metrics. Customers can also use this metric to reserve spare capacity in their clusters. Reserving spare capacity allows customers to run more containers immediately if needed, without waiting for new instances to start. Managed Instance Termination Protection With instance termination protection, ECS controls which instances the scaling policy is allowed to terminate on scale-in, to minimize disruptions of running containers. These improvements help customers achieve lower operational costs and higher availability of their container workloads running on ECS. How This Helps Customers Customers running scalable container workloads on ECS often use metric-based scaling policies to automatically scale their ECS clusters. These scaling policies use generic metrics such as average cluster CPU and memory reservation percentages to determine when the policy should add or remove cluster instances. Clusters running a single workload, or workloads that scale out slowly, often work well with such policies. However, customers running multiple workloads in the same cluster, or workloads that scale out rapidly, are more likely to experience problems with cluster scaling. Ideally, increases in workload size that cannot be accommodated by the current cluster should trigger the policy to scale the cluster out to a larger size. Because the existing metrics are not container-specific and account only for resources already in use, this may happen slowly or be unreliable. Furthermore, because the scaling policy does not know where containers are running in the cluster, it can unnecessarily terminate containers when scaling in. These issues can reduce the availability of container workloads. Mitigations such as over-provisioning, custom tooling, or manual intervention often impose high operational costs. Enough Talk, Let’s Scale To understand these new features more clearly, I think it’s helpful to work through an example. Amazon ECS Cluster Auto Scaling can be set up and configured using the AWS Management Console, AWS CLI, or Amazon ECS API.
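One note before the walkthrough: once a capacity provider is attached to a cluster, the new metric can be inspected like any other CloudWatch metric. The sketch below is my own illustration rather than part of the launch walkthrough; the namespace and dimension names are assumptions to double-check against the ECS documentation, and the cluster and capacity provider names match the example that follows:

# discover the managed scaling metrics that ECS publishes
aws cloudwatch list-metrics --namespace AWS/ECS/ManagedScaling

# fetch recent CapacityProviderReservation datapoints (times are placeholders)
aws cloudwatch get-metric-statistics \
    --namespace AWS/ECS/ManagedScaling \
    --metric-name CapacityProviderReservation \
    --dimensions Name=ClusterName,Value=demo-news-blog-scale Name=CapacityProviderName,Value=demo-capacityprovider \
    --statistics Average --period 60 \
    --start-time 2019-12-03T00:00:00Z --end-time 2019-12-03T01:00:00Z

As I understand it, a value of 100 means the cluster is sized exactly for the current workload, values above 100 indicate that more instances are needed, and the managed scaling policy steers the metric toward the targetCapacity you configure on the capacity provider.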
I’m going to open up my terminal and create a cluster. Firstly, I create two files. The first file is called demo-launchconfig.json and defines the instance configuration for the Amazon Elastic Compute Cloud (EC2) instances that will make up my auto scaling group. { "LaunchConfigurationName": "demo-launchconfig", "ImageId": "ami-01f07b3fa86406c96", "SecurityGroups": [ "sg-0fa5be8c3749f3aa0" ], "InstanceType": "t2.micro", "BlockDeviceMappings": [ { "DeviceName": "/dev/xvdcz", "Ebs": { "VolumeSize": 22, "VolumeType": "gp2", "DeleteOnTermination": true, "Encrypted": true } } ], "InstanceMonitoring": { "Enabled": false }, "IamInstanceProfile": "arn:aws:iam::365489315573:role/ecsInstanceRole", "AssociatePublicIpAddress": true } The second file is demo-userdata.txt, and it contains the user data that will be added to each EC2 instance. The ECS_CLUSTER name included in the file must be the same as the name of the cluster we are going to create. In my case, the name is demo-news-blog-scale. #!/bin/bash echo ECS_CLUSTER=demo-news-blog-scale >> /etc/ecs/ecs.config Using the create-launch-configuration command, I pass the two files I created as inputs; this creates the launch configuration that I will use in my auto scaling group. aws autoscaling create-launch-configuration --cli-input-json file://demo-launchconfig.json --user-data file://demo-userdata.txt Next, I create a file called demo-asgconfig.json and define my requirements. { "LaunchConfigurationName": "demo-launchconfig", "MinSize": 0, "MaxSize": 100, "DesiredCapacity": 0, "DefaultCooldown": 300, "AvailabilityZones": [ "ap-southeast-1c" ], "HealthCheckType": "EC2", "HealthCheckGracePeriod": 300, "VPCZoneIdentifier": "subnet-abcd1234", "TerminationPolicies": [ "DEFAULT" ], "NewInstancesProtectedFromScaleIn": true, "ServiceLinkedRoleARN": "arn:aws:iam::111122223333:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling" } I then use the create-auto-scaling-group command to create an auto scaling group called demo-asg using the above file as an input. aws autoscaling create-auto-scaling-group --auto-scaling-group-name demo-asg --cli-input-json file://demo-asgconfig.json I am now ready to create a capacity provider. I create a file called demo-capacityprovider.json and, importantly, I set the managedTerminationProtection property to ENABLED. { "name": "demo-capacityprovider", "autoScalingGroupProvider": { "autoScalingGroupArn": "arn:aws:autoscaling:ap-southeast-1:365489315573:autoScalingGroup:e9c2f0c4-9a4c-428e-b81e-b22411a52954:autoScalingGroupName/demo-ASG", "managedScaling": { "status": "ENABLED", "targetCapacity": 100, "minimumScalingStepSize": 1, "maximumScalingStepSize": 100 }, "managedTerminationProtection": "ENABLED" } } I then use the new create-capacity-provider command to create a provider using the file as an input. aws ecs create-capacity-provider --cli-input-json file://demo-capacityprovider.json Now that all the components have been created, I can finally create a cluster. I add the capacity provider and set the default capacity provider for the cluster as demo-capacityprovider. aws ecs create-cluster --cluster-name demo-news-blog-scale --capacity-providers demo-capacityprovider --default-capacity-provider-strategy capacityProvider=demo-capacityprovider,weight=1 I now need to wait until the cluster has moved into the active state. I use the following command to get details about the cluster.
aws ecs describe-clusters --clusters demo-news-blog-scale --include ATTACHMENTS Now that my cluster is set up, I can register some tasks. Firstly, I need to create a task definition. Below is a file I have created called demo-sleep-taskdef.json. All this definition does is define a container that sleeps for infinity. { "family": "demo-sleep-taskdef", "containerDefinitions": [ { "name": "sleep", "image": "amazonlinux:2", "memory": 20, "essential": true, "command": [ "sh", "-c", "sleep infinity"] }], "requiresCompatibilities": [ "EC2"] } I then register the task definition using the register-task-definition command. aws ecs register-task-definition --cli-input-json file://demo-sleep-taskdef.json Finally, I can create my tasks. In this case, I have created 5 tasks based on the demo-sleep-taskdef:1 definition that I just registered. aws ecs run-task --cluster demo-news-blog-scale --count 5 --task-definition demo-sleep-taskdef:1 Now because instances are not yet available to run the tasks, the tasks go into a provisioning state, which means they are waiting for capacity to become available. The capacity provider I configured will now scale out the auto scaling group so that instances start up and join the cluster – at which point the tasks get placed on the instances. This gives a true “scale from zero” capability, which did not previously exist. Things To Know Amazon ECS Cluster Auto Scaling is now available in all regions where Amazon ECS and AWS Auto Scaling are available – check the region table for the latest list. Happy Scaling! — Martin

AWS Fargate Spot Now Generally Available

Today at AWS re:Invent 2019 we announced AWS Fargate Spot. Fargate Spot is a new capability on AWS Fargate that can run interruption-tolerant Amazon Elastic Container Service (Amazon ECS) tasks at up to a 70% discount off the Fargate price. If you are familiar with EC2 Spot Instances, the concept is the same. We use spare capacity in the AWS cloud to run your tasks. When the capacity for Fargate Spot is available, you will be able to launch tasks based on your specified request. When AWS needs the capacity back, tasks running on Fargate Spot will be interrupted with two minutes of notification. If the capacity for Fargate Spot stops being available, Fargate will scale down tasks running on Fargate Spot while maintaining any regular tasks you are running. As your tasks could be interrupted, you should not run tasks on Fargate Spot that cannot tolerate interruptions. However, for your fault-tolerant workloads, this feature enables you to optimize your costs. The service is an obvious fit for parallelizable workloads like image rendering, Monte Carlo simulations, and genomic processing. However, customers can also use Fargate Spot for tasks that run as a part of ECS services such as websites and APIs which require high availability. When configuring your Service Autoscaling policy, you can specify the minimum number of regular tasks that should run at all times and then add tasks running on Fargate Spot to improve service performance in a cost-efficient way. When the capacity for Fargate Spot is available, the Scheduler will launch tasks to meet your request. If the capacity for Fargate Spot stops being available, Fargate Spot will scale down, while maintaining the minimum number of regular tasks to ensure the application’s availability. So let us take a look at how we can get started using AWS Fargate Spot. First, I create a new Fargate cluster inside the ECS console: I choose Networking only and follow the wizard to complete the process. Once my cluster is created, I need to add a capacity provider; by default, my cluster has two capacity providers, FARGATE and FARGATE_SPOT. To use the FARGATE_SPOT capacity provider, I update my cluster and set the default provider to use FARGATE_SPOT: I press the Update Cluster button, select FARGATE_SPOT as the default capacity provider, and click Update. I then run a task in the cluster in the usual way. I select my task definition and enter that I want 10 tasks. Then, after configuring VPC and security groups, I click Run Task. Now the 10 tasks run, but rather than using regular Fargate infrastructure, they use Fargate Spot. If I peek inside one of the tasks, I can verify that the task is indeed using the FARGATE_SPOT capacity provider. So that’s how you get started with Fargate Spot; you can try it yourself right now. A few weeks ago, we saw the release of Compute Savings Plans (of which Fargate is a part) and now with Fargate Spot, customers can save a great deal of money and run many different types of applications; there has never been a better time to be using Fargate. AWS Fargate Spot is available in all regions where AWS Fargate is available, so you can try it yourself today. — Martin

New – EBS Direct APIs – Programmatic Access to EBS Snapshot Content

EBS Snapshots are really cool! You can create them interactively from the AWS Management Console: You can create them from the Command Line (create-snapshot) or by making a call to the CreateSnapshot function, and you can use the Data Lifecycle Manager (DLM) to set up automated snapshot management. All About Snapshots The snapshots are stored in Amazon Simple Storage Service (S3), and can be used to quickly create fresh EBS volumes as needed. The first snapshot of a volume contains a copy of every 512K block on the volume. Subsequent snapshots contain only the blocks that have changed since the previous snapshot. The incremental nature of the snapshots makes them very cost-effective, since (statistically speaking) many of the blocks on an EBS volume do not change all that often. Let’s look at a quick example. Suppose that I create and format an EBS volume with 8 blocks (this is smaller than the allowable minimum size, but bear with me), copy some files to it, and then create my first snapshot (Snap1). The snapshot contains all of the blocks, and looks like this: Then I add a few more files, delete one, and create my second snapshot (Snap2). The snapshot contains only the blocks that were modified after I created the first one, and looks like this: I make a few more changes, and create a third snapshot (Snap3): Keep in mind that the relationship between directories, files, and the underlying blocks is controlled by the file system, and is generally quite complex in real-world situations. Ok, so now I have three snapshots, and want to use them to create a new volume. Each time I create a snapshot of an EBS volume, an internal reference to the previous snapshot is created. This allows CreateVolume to find the most recent copy of each block, like this: EBS manages all of the details for me behind the scenes. For example, if I delete Snap2, the copy of Block 0 in that snapshot is also deleted since the copy in Snap3 is newer, but the copy of Block 4 in Snap2 becomes part of Snap3: By the way, the chain of backward references (Snap3 to Snap1, or Snap3 to Snap2 to Snap1) is referred to as the lineage of the set of snapshots. Now that I have explained all this, I should also tell you that you generally don’t need to know this, and can focus on creating, using, and deleting snapshots! However… Access to Snapshot Content Today we are introducing EBS direct APIs that provide you with access to the snapshot content, as described above. These APIs are designed for developers of backup/recovery, disaster recovery, and data management products & services, and will allow them to make their offerings faster and more cost-effective. The new APIs use a block index (0, 1, 2, and so forth) to identify a particular 512K block within a snapshot. The index is returned in the form of an encrypted token, which is meaningful only to the GetSnapshotBlock API. I have represented these tokens as T0, T1, and so forth below. The APIs currently work on blocks of 512K bytes, with plans to support more block sizes in the future. Here are the APIs: ListSnapshotBlocks – Identifies all of the blocks in a given snapshot as encrypted tokens. For Snap1, it would return [T0, T1, T2, T3, T4, T5, T6, T7] and for Snap2 it would return [T0, T4]. GetSnapshotBlock – Returns the content of a block. If the block is part of an encrypted snapshot, it will be returned in decrypted form. ListChangedBlocks – Returns the list of blocks that have changed between two snapshots in a lineage, again as encrypted tokens.
For Snap2 it would return [T0, T4] and for Snap3 it would return [T0, T5]. Like I said, these APIs were built to address one specialized yet very important use case. Having said that, I am now highly confident that new and unexpected ones will pop up within 48 hours (feel free to share them with me)! Available Now The EBS direct APIs are available now and you can start using them today in the US East (N. Virginia), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Asia Pacific (Singapore), and Asia Pacific (Tokyo) Regions; they will become available in the remaining regions in the next few weeks. There is a charge for calls to the List and Get APIs, and the usual KMS charges will apply when you call GetSnapshotBlock to access a block that is part of an encrypted snapshot. — Jeff;

AWS Transit Gateway Adds Multicast and Inter-Regional Peering

AWS Transit Gateway is a service that enables customers to connect thousands of Amazon Virtual Private Clouds (VPCs) and their on-premises networks using a single gateway. Customers have been enjoying the reduction in operational costs and the overall simplicity that this service brings. Today, things got even better with the release of two new features, AWS Transit Gateway inter-region peering and AWS Transit Gateway multicast support. Inter-Region Peering As customers expand workloads on AWS, they need to scale their networks across multiple accounts and VPCs. Today, customers can connect pairs of VPCs using peering or use PrivateLink to expose private service endpoints from one VPC to another; however, managing this at scale is complicated. AWS Transit Gateway inter-region peering addresses this and makes it easy to create secure and private global networks across multiple AWS regions. Using inter-region peering, customers can create centralized routing policies between the different networks in their organization, simplifying management and reducing costs. All the traffic that flows through inter-region peering is anonymized and encrypted, and is carried by the AWS backbone, ensuring it always takes the optimal path between regions in the most secure way. Multicast AWS Transit Gateway multicast makes it easy for customers to build multicast applications in the cloud and distribute data across thousands of connected Virtual Private Cloud networks. Multicast delivers a single stream of data to many users simultaneously. It is a preferred protocol for streaming multimedia content and subscription data, such as news articles and stock quotes, to a group of subscribers. AWS is the first cloud provider to offer a native multicast solution, which will enable customers to migrate their applications to the cloud and take advantage of the elasticity and scalability that AWS provides. With this release we are introducing multicast domains in transit gateways. Similar to routing domains, multicast domains allow you to segment your multicast network into different domains and make the transit gateway act as multiple multicast routers. Available Now These two new features are ready and waiting for you to try today. Inter-region peering is available in US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), and EU (Frankfurt), and multicast is available in US East (N. Virginia). — Martin

AWS Compute Optimizer – Your Customized Resource Optimization Service

When I speak publicly about Amazon EC2 instance types, one frequently asked question I receive is “How can I be sure I chose the right instance type for my application?” Choosing the correct instance type is somewhere between art and science. It usually involves knowing your application performance characteristics under normal circumstances (the baseline) and the expected daily variations, and picking an instance type that matches these characteristics. After that, you monitor key metrics to validate your choice, and you iterate over time to adjust the instance type that best suits the cost vs performance ratio for your application. Over-provisioning resources results in paying too much for your infrastructure, and under-provisioning resources results in lower application performance, possibly impacting customer experience. Earlier this year, we launched Cost Explorer Rightsizing Recommendations, which helps you identify under-utilized Amazon Elastic Compute Cloud (EC2) instances that may be downsized within the same family to save money. We received great feedback and customers are asking for more recommendations beyond just downsizing within the same instance family. Today, we are announcing a new service to help you to optimize compute resources for your workloads: AWS Compute Optimizer. AWS Compute Optimizer uses machine learning techniques to analyze the history of resource consumption on your account, and make well-articulated and actionable recommendations tailored to your resource usage. AWS Compute Optimizer is integrated with AWS Organizations; you can view recommendations for multiple accounts from your master AWS Organizations account. To get started with AWS Compute Optimizer, I navigate to the AWS Management Console, select AWS Compute Optimizer, and activate the service. It immediately starts to analyze my resource usage and history using Amazon CloudWatch metrics and delivers the first recommendations a few hours later. I can see the first recommendations on the AWS Compute Optimizer dashboard: I click Over-provisioned: 8 instances to get the details: I click on one of the eight links to get the actionable findings: AWS Compute Optimizer offers multiple options.
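A quick aside before looking at the details: the opt-in step performed earlier in the console can also be done from the CLI. Here is a minimal sketch based on my reading of the Compute Optimizer API; double-check the command names against the current AWS CLI before relying on them:

# opt the current account in to AWS Compute Optimizer
aws compute-optimizer update-enrollment-status --status Active

# confirm the account is enrolled
aws compute-optimizer get-enrollment-status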
I scroll down to the bottom of that page to see what the impact would be if I decide to apply this recommendation: I can also access the recommendation from the AWS Command Line Interface (CLI): $ aws compute-optimizer get-ec2-instance-recommendations --instance-arns arn:aws:ec2:us-east-1:012345678912:instance/i-0218a45abd8b53658 { "instanceRecommendations": [ { "instanceArn": "arn:aws:ec2:us-east-1:012345678912:instance/i-0218a45abd8b53658", "accountId": "012345678912", "currentInstanceType": "m5.xlarge", "finding": "OVER_PROVISIONED", "utilizationMetrics": [ { "name": "CPU", "statistic": "MAXIMUM", "value": 2.0 } ], "lookBackPeriodInDays": 14.0, "recommendationOptions": [ { "instanceType": "r5.large", "projectedUtilizationMetrics": [ { "name": "CPU", "statistic": "MAXIMUM", "value": 3.2 } ], "performanceRisk": 1.0, "rank": 1 }, { "instanceType": "t3.xlarge", "projectedUtilizationMetrics": [ { "name": "CPU", "statistic": "MAXIMUM", "value": 2.0 } ], "performanceRisk": 3.0, "rank": 2 }, { "instanceType": "m5.xlarge", "projectedUtilizationMetrics": [ { "name": "CPU", "statistic": "MAXIMUM", "value": 2.0 } ], "performanceRisk": 1.0, "rank": 3 } ], "recommendationSources": [ { "recommendationSourceArn": "arn:aws:ec2:us-east-1:012345678912:instance/i-0218a45abd8b53658", "recommendationSourceType": "Ec2Instance" } ], "lastRefreshTimestamp": 1575006953.102 } ], "errors": [] } Keep in mind that AWS Compute Optimizer uses Amazon CloudWatch metrics as the basis for its recommendations. By default, CloudWatch metrics are the ones that can be observed from a hypervisor point of view, such as CPU utilization, disk IO, and network IO. If I want AWS Compute Optimizer to take into account operating system level metrics, such as memory usage, I need to install a CloudWatch agent on my EC2 instance. AWS Compute Optimizer automatically recognizes these metrics when available and takes them into account when creating recommendations; otherwise, it shows “Data Unavailable” in the console. AWS customers told us that performance is not the only metric they look at when choosing a resource; the price vs performance ratio is important too. For example, it might make sense to use a new generation instance family, such as m5, rather than the older generation (m3 or m4), even when the new generation seems over-provisioned for the workload. This is why, after AWS Compute Optimizer identifies a list of optimal AWS resources for your workload, it presents on-demand pricing, reserved instance pricing, reserved instance utilization, reserved instance coverage, and expected resource efficiency alongside its recommendations. AWS Compute Optimizer makes it easy to right-size your resources. However, keep in mind that while it is relatively easy to right-size resources for modern applications, or stateless applications that scale horizontally, it might be very difficult to right-size older apps. Some older apps might not run correctly under a different hardware architecture, might need different drivers, or might not be supported by the application vendor at all. Be sure to check with your vendor before trying to optimize cloud resources for packaged or older apps. We strongly advise you to thoroughly test your applications on the new recommended instance type before applying any recommendations into production. Compute Optimizer is free to use and available initially in these AWS Regions: US East (N. Virginia), US West (Oregon), Europe (Ireland), US East (Ohio), and South America (São Paulo).
Connect to the AWS Management Console today and discover how much you can save by choosing the right resource size for your cloud applications. -- seb

New for AWS Transit Gateway – Build Global Networks and Centralize Monitoring Using Network Manager

As your company grows and gets the benefits of a cloud-based infrastructure, your on-premises sites like offices and stores increasingly need high-performance private connectivity to AWS and to other sites at a reasonable cost. Growing your network is hard, because traditional branch networks based on leased lines are costly, and they suffer from the same lack of elasticity and agility as traditional data centers. At the same time, it becomes increasingly complex to manage and monitor a global network that is spread across AWS Regions and on-premises sites. You need to stitch together data from these diverse locations. This results in an inconsistent operational experience, increased costs and efforts, and missed insights from the lack of visibility across different technologies. Today, we want to make it easier to build, manage, and monitor global networks with the following new capabilities for AWS Transit Gateway: Transit Gateway inter-region peering, accelerated site-to-site VPN, and AWS Transit Gateway Network Manager. These new networking capabilities enable you to optimize your network using AWS’s global backbone, and to centrally visualize and monitor your global network. More specifically: Inter-region peering and accelerated VPN improve application performance by leveraging the AWS Global Network. In this way, you can reduce the number of leased lines required to operate your network, optimizing your cost and improving agility. Transit Gateway inter-region peering sends inter-region traffic privately over AWS’s global network backbone. Accelerated VPN uses AWS Global Accelerator to route VPN traffic from remote locations through the closest AWS edge location to improve connection performance. Network Manager reduces the operational complexity of managing a global network across AWS and on-premises. With Network Manager, you set up a global view of your private network simply by registering your Transit Gateways and on-premises resources. Your global network can then be visualized and monitored via a centralized operational dashboard. These features allow you to optimize connectivity from on-premises sites to AWS and also between on-premises sites, by routing traffic through Transit Gateways and the AWS Global Network, and to centrally manage it all through Network Manager. Visualizing Your Global Network In the Network Manager console, which you can reach from the Transit Gateways section of the Amazon Virtual Private Cloud console, you have an overview of your global networks. Each global network includes AWS and on-premises resources. Specifically, it provides a central point of management for your AWS Transit Gateways, your physical devices and sites connected to the Transit Gateways via Site-to-Site VPN Connections, and AWS Direct Connect locations attached to the Transit Gateways. For example, this is the Geographic view of a global network covering North America and Europe with 5 Transit Gateways in 3 AWS Regions, 80 VPCs, 50 VPNs, 1 Direct Connect location, and 16 on-premises sites with 50 devices: As I zoom in on the map, I get a description of what these nodes represent, for example whether they are AWS Regions, Direct Connect locations, or branch offices. I can select any node in the map to get more information. For example, I select the US West (Oregon) AWS Region to see the details of the two Transit Gateways I am using there, including the state of all VPN connections, VPCs, and VPNs handled by the selected Transit Gateway.
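The global network itself can also be created and populated programmatically. Here is a minimal sketch using the Network Manager CLI, with a placeholder global network ID and transit gateway ARN standing in for real resources:

# create a global network to hold all the resources
aws networkmanager create-global-network --description "my-global-network"

# register an existing transit gateway into the global network
aws networkmanager register-transit-gateway \
    --global-network-id global-network-0123456789abcdef0 \
    --transit-gateway-arn arn:aws:ec2:us-west-2:123456789012:transit-gateway/tgw-0123456789abcdef0

On-premises sites, devices, and links can be added in a similar way (for example with create-site and create-device), or defined automatically by the SD-WAN integrations mentioned below.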
Selecting a site, I get a centralized view with the status of the VPN connections, including site metadata such as address, location, and description. For example, here are the details of the Colorado branch offices. In the Topology panel, I see the logical relationship of all the resources in my network. On the left here is the entire topology of my global network; on the right, the detail of the European part. Connection statuses are reported in color in the topology view. Selecting any node in the topology map displays details specific to the resource type (Transit Gateway, VPC, customer gateway, and so on) including links to the corresponding service in the AWS console to get more information and configure the resource. Monitoring Your Global Network Network Manager uses Amazon CloudWatch, which collects raw data and processes it into readable, near real-time metrics for data in/out, packets dropped, and VPN connection status. These statistics are kept for 15 months, so that you can access historical information and gain a better perspective on how your network is performing. You can also set alarms that watch for certain thresholds, and send notifications or take actions when those thresholds are met. For example, these are the last 12 hours of monitoring for the Transit Gateway in Europe (Ireland). In the global network view, you have a single point of view of all events affecting your network, simplifying root cause analysis in case of issues. Clicking on any of the messages in the console will take you to a more detailed view in the Events tab. Your global network events are also delivered by CloudWatch Events. Using simple rules that you can quickly set up, you can match events and route them to one or more target functions or streams. To process the same events, you can also use the additional capabilities offered by Amazon EventBridge. Network Manager sends the following types of events: Topology changes, for example when a VPN connection is created for a transit gateway. Routing updates, such as when a route is deleted in a transit gateway route table. Status updates, for example in case a VPN tunnel’s BGP session goes down. Configuring Your Global Network To get your on-premises resources included in the above visualizations and monitoring, you need to provide Network Manager with information about your on-premises devices, sites, and links. You also need to associate devices with the customer gateways they host for VPN connections. Our software-defined wide area network (SD-WAN) partners, such as Cisco, Aruba, Silver Peak, and Aviatrix, have configured their SD-WAN devices to connect with Transit Gateway Network Manager in only a few clicks. Their SD-WANs also define the on-premises devices, sites, and links automatically in Network Manager. SD-WAN integrations enable you to include your on-premises network in the Network Manager global dashboard view without having to input information manually. Available Now AWS Transit Gateway Network Manager is a global service available for Transit Gateways in the following regions: US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Europe (London), Europe (Paris), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Sydney), Asia Pacific (Mumbai), Canada (Central), South America (São Paulo). There is no additional cost for using Network Manager.
You pay for the network resources you use, like Transit Gateways, VPNs, and so on. Here you can find more information on pricing for VPN and Transit Gateway. You can learn more in the documentation for Network Manager, inter-region peering, and accelerated VPN. With these new features, you can take advantage of the performance of our AWS Global Network, and simplify network management and monitoring across your AWS and on-premises resources. — Danilo

New – VPC Ingress Routing – Simplifying Integration of Third-Party Appliances

When I was delivering the Architecting on AWS class, customers often asked me how to configure an Amazon Virtual Private Cloud to enforce the same network security policies in the cloud as they have on-premises. For example, to scan all ingress traffic with an Intrusion Detection System (IDS) appliance or to use the same firewall in the cloud as on-premises. Until today, the only answer I could provide was to route all traffic back from their VPC to an on-premises appliance or firewall in order to inspect the traffic with their usual networking gear before routing it back to the cloud. This is obviously not an ideal configuration: it adds latency and complexity. Today, we announce new VPC routing primitives that allow you to route all incoming and outgoing traffic to/from an Internet Gateway (IGW) or Virtual Private Gateway (VGW) to a specific EC2 instance’s Elastic Network Interface. This means you can now configure your Virtual Private Cloud to send all traffic to an EC2 instance before the traffic reaches your business workloads. The instance typically runs network security tools to inspect or to block suspicious network traffic (such as an IDS/IPS or firewall) or to perform any other network traffic inspection before relaying the traffic to other EC2 instances. How Does it Work? To learn how it works, I wrote this CDK script to create a VPC with two public subnets: one subnet for the appliance and one subnet for a business application. The script launches two EC2 instances with public IP addresses, one in each subnet. The script creates the below architecture: This is a regular VPC; the subnets have routing tables to the Internet Gateway, and the traffic flows in and out as expected. The application instance hosts a static web site that is accessible from any browser. You can retrieve the application public DNS name from the EC2 Console (for your convenience, I also included the CLI version in the comments of the CDK script). AWS_REGION=us-west-2 APPLICATION_IP=$(aws ec2 describe-instances \ --region $AWS_REGION \ --query "Reservations[].Instances[] | [?Tags[?Key=='Name' && Value=='application']].NetworkInterfaces[].Association.PublicDnsName" \ --output text) curl -I $APPLICATION_IP Configure Routing To configure routing, you need to know the VPC ID, the ENI ID of the ENI attached to the appliance instance, and the Internet Gateway ID.
Assuming you created the infrastructure using the CDK script I provided, here are the commands I use to find these three IDs (be sure to adjust to the AWS region you use): AWS_REGION=us-west-2 VPC_ID=$(aws cloudformation describe-stacks \ --region $AWS_REGION \ --stack-name VpcIngressRoutingStack \ --query "Stacks[].Outputs[?OutputKey=='VPCID'].OutputValue" \ --output text) ENI_ID=$(aws ec2 describe-instances \ --region $AWS_REGION \ --query "Reservations[].Instances[] | [?Tags[?Key=='Name' && Value=='appliance']].NetworkInterfaces[].NetworkInterfaceId" \ --output text) IGW_ID=$(aws ec2 describe-internet-gateways \ --region $AWS_REGION \ --query "InternetGateways[] | [?Attachments[?VpcId=='${VPC_ID}']].InternetGatewayId" \ --output text) To route all incoming traffic through my appliance, I create a routing table for the Internet Gateway and I attach a rule to direct all traffic to the EC2 instance Elastic Network Interface (ENI): # create a new routing table for the Internet Gateway ROUTE_TABLE_ID=$(aws ec2 create-route-table \ --region $AWS_REGION \ --vpc-id $VPC_ID \ --query "RouteTable.RouteTableId" \ --output text) # create a route for 10.0.1.0/24 pointing to the appliance ENI aws ec2 create-route \ --region $AWS_REGION \ --route-table-id $ROUTE_TABLE_ID \ --destination-cidr-block 10.0.1.0/24 \ --network-interface-id $ENI_ID # associate the routing table to the Internet Gateway aws ec2 associate-route-table \ --region $AWS_REGION \ --route-table-id $ROUTE_TABLE_ID \ --gateway-id $IGW_ID Alternatively, I can use the VPC Console under the new Edge Associations tab. To route all application outgoing traffic through the appliance, I replace the default route for the application subnet to point to the appliance’s ENI: SUBNET_ID=$(aws ec2 describe-instances \ --region $AWS_REGION \ --query "Reservations[].Instances[] | [?Tags[?Key=='Name' && Value=='application']].NetworkInterfaces[].SubnetId" \ --output text) ROUTING_TABLE=$(aws ec2 describe-route-tables \ --region $AWS_REGION \ --query "RouteTables[?VpcId=='${VPC_ID}'] | [?Associations[?SubnetId=='${SUBNET_ID}']].RouteTableId" \ --output text) # delete the existing default route (the one pointing to the internet gateway) aws ec2 delete-route \ --region $AWS_REGION \ --route-table-id $ROUTING_TABLE \ --destination-cidr-block 0.0.0.0/0 # create a default route pointing to the appliance's ENI aws ec2 create-route \ --region $AWS_REGION \ --route-table-id $ROUTING_TABLE \ --destination-cidr-block 0.0.0.0/0 \ --network-interface-id $ENI_ID aws ec2 associate-route-table \ --region $AWS_REGION \ --route-table-id $ROUTING_TABLE \ --subnet-id $SUBNET_ID Alternatively, I can use the VPC Console. Within the correct routing table, I select the Routes tab and click Edit routes to replace the default route (the one pointing to 0.0.0.0/0) so that it targets the appliance’s ENI. Now I have the routing configuration in place. The new routing looks like this: Configure the Appliance Instance Finally, I configure the appliance instance to forward all traffic it receives. Your software appliance usually does that for you; no extra step is required when you use AWS Marketplace appliances. When using a plain Linux instance, two extra steps are required: 1.
Connect to the EC2 appliance instance and configure IP traffic forwarding in the kernel: APPLIANCE_ID=$(aws ec2 describe-instances \ --region $AWS_REGION \ --query "Reservations[].Instances[] | [?Tags[?Key=='Name' && Value=='appliance']].InstanceId" \ --output text) aws ssm start-session --region $AWS_REGION --target $APPLIANCE_ID ## ## once connected (you see the 'sh-4.2$' prompt), type: ## sudo sysctl -w net.ipv4.ip_forward=1 sudo sysctl -w net.ipv6.conf.all.forwarding=1 exit 2. Configure the EC2 instance to accept traffic for destinations other than itself (this is known as disabling the source/destination check): aws ec2 modify-instance-attribute --region $AWS_REGION \ --no-source-dest-check \ --instance-id $APPLIANCE_ID Now, the appliance is ready to forward traffic to the other EC2 instances. You can test this by pointing your browser (or using cURL) to the application instance. APPLICATION_IP=$(aws ec2 describe-instances --region $AWS_REGION \ --query "Reservations[].Instances[] | [?Tags[?Key=='Name' && Value=='application']].NetworkInterfaces[].Association.PublicDnsName" \ --output text) curl -I $APPLICATION_IP To verify the traffic is really flowing through the appliance, you can enable the source/destination check on the instance again (use the --source-dest-check parameter with the modify-instance-attribute CLI command above). The traffic is blocked when the source/destination check is enabled. Cleanup Should you use the CDK script I provided for this article, be sure to run cdk destroy when finished. This ensures you are not billed for the two EC2 instances I use for this demo. As I modified routing tables behind the back of AWS CloudFormation, I need to manually delete the routing tables, the subnet, and the VPC. The easiest way is to navigate to the VPC Console, select the VPC and click Actions => Delete VPC. The console deletes all components in the correct order. You might need to wait 5-10 minutes after the end of cdk destroy before the console is able to delete the VPC. Availability There are no additional costs to use Virtual Private Cloud ingress routing. It is available in all AWS Regions (including AWS GovCloud (US-West)) and you can start to use it today. You can learn more about gateway routing tables in the updated VPC documentation. What are the appliances you are going to use with this new VPC routing capability? -- seb

Amazon EC2 Update – Inf1 Instances with AWS Inferentia Chips for High Performance Cost-Effective Inferencing

Our customers are taking to Machine Learning in a big way. They are running many different types of workloads, including object detection, speech recognition, natural language processing, personalization, and fraud detection. When running on large-scale production workloads, it is essential that they can perform inferencing as quickly and as cost-effectively as possible. According to what they have told us, inferencing can account for up to 90% of the cost of their machine learning work. New Inf1 Instances Today we are launching Inf1 instances in four sizes. These instances are powered by AWS Inferentia chips, and are designed to provide you with fast, low-latency inferencing. AWS Inferentia chips are designed to accelerate the inferencing process. Each chip can deliver the following performance: 64 teraOPS on 16-bit floating point (FP16 and BF16) and mixed-precision data. 128 teraOPS on 8-bit integer (INT8) data. The chips also include a high-speed interconnect, and lots of memory. With 16 chips on the largest instance, your new and existing TensorFlow, PyTorch, and MXNet inferencing workloads can benefit from over 2 petaOPS of inferencing power. When compared to the G4 instances, the Inf1 instances offer up to 3x the inferencing throughput, and up to 40% lower cost per inference. Here are the sizes and specs:
inf1.xlarge – 1 Inferentia chip, 4 vCPUs, 8 GiB RAM, up to 3.5 Gbps EBS bandwidth, up to 25 Gbps network bandwidth
inf1.2xlarge – 1 Inferentia chip, 8 vCPUs, 16 GiB RAM, up to 3.5 Gbps EBS bandwidth, up to 25 Gbps network bandwidth
inf1.6xlarge – 4 Inferentia chips, 24 vCPUs, 48 GiB RAM, 3.5 Gbps EBS bandwidth, 25 Gbps network bandwidth
inf1.24xlarge – 16 Inferentia chips, 96 vCPUs, 192 GiB RAM, 14 Gbps EBS bandwidth, 100 Gbps network bandwidth
The instances make use of custom Second Generation Intel® Xeon® Scalable (Cascade Lake) processors, and are available in On-Demand, Spot, and Reserved Instance form, or as part of a Savings Plan in the US East (N. Virginia) and US West (Oregon) Regions. You can launch the instances directly, and they will also be available soon through Amazon SageMaker, Amazon ECS, and Amazon Elastic Kubernetes Service. Using Inf1 Instances AWS Deep Learning AMIs have been updated and contain versions of TensorFlow and MXNet that have been optimized for use in Inf1 instances, with PyTorch coming very soon. The AMIs contain the new AWS Neuron SDK, which contains commands to compile, optimize, and execute your ML models on the Inferentia chip. You can also include the SDK in your own AMIs and images. You can build and train your model on a GPU instance such as a P3 or P3dn, and then move it to an Inf1 instance for production use. You can use a model natively trained in FP16, or you can use models that have been trained to 32 bits of precision and have AWS Neuron automatically convert them to BF16 form. Large models, such as those for language translation or natural language processing, can be split across multiple Inferentia chips in order to reduce latency. The AWS Neuron SDK also allows you to assign models to Neuron Compute Groups, and to run them in parallel. This allows you to maximize hardware utilization and to use multiple models as part of Neuron Core Pipeline mode, taking advantage of the large on-chip cache on each Inferentia chip. Be sure to read the AWS Neuron SDK Tutorials to learn more! — Jeff;

AWS Outposts Now Available – Order Yours Today!

We first discussed AWS Outposts at re:Invent 2018. Today, I am happy to announce that we are ready to take orders and install Outposts racks in your data center or colo facility. Why Outposts? This new and unique AWS offering is a comprehensive, single-vendor compute & storage solution that is designed to meet the needs of customers who need local processing and very low latency. You no longer need to spend time creating detailed hardware specifications, soliciting & managing bids from multiple disparate vendors, or racking & stacking individual servers. Instead, you place your order online, take delivery, and relax while trained AWS technicians install, connect, set up, and verify your Outposts. Once installed, we take care of monitoring, maintaining, and upgrading your Outposts. All of the hardware is modular and can be replaced in the field without downtime. When you need more processing or storage, or want to upgrade to newer generations of EC2 instances, you can initiate the request with a couple of clicks and we will take care of the rest. Everything that you and your team already know about AWS still applies. You use the same APIs, tools, and operational practices. You can create a single deployment pipeline that targets your Outposts and your cloud-based environments, and you can create hybrid architectures that span both. Each Outpost is connected to and controlled by a specific AWS Region. The region treats a collection of up to 16 racks at a single location as a unified capacity pool. The collection can be associated with subnets of one or more VPCs in the parent region. Outposts Hardware The Outposts hardware is the same as what we use in our own data centers, with some additional security devices. The hardware is designed for reliability & efficiency, with redundant network switches and power supplies, and DC power distribution. Outpost racks are 80″ tall, 24″ wide, 48″ deep, and can weigh up to 2000 lbs. They arrive fully assembled, and roll in on casters, ready for connection to power and networking. To learn more about the Outposts hardware, watch my colleague Anthony Liguori explain it: Outposts support multiple Intel®-powered Nitro-based EC2 instance types including C5, C5d, M5, M5d, R5, R5d, G4, and I3en. You can choose the mix of types that is right for your environment, and you can add more later. You will also be able to upgrade to newer instance types as they become available. On the storage side, Outposts support EBS gp2 (general purpose SSD) storage, with a minimum size of 2.7 TB. Outpost Networking Each Outpost has a pair of networking devices, each with 400 Gbps of connectivity and support for 1 GigE, 10 GigE, 40 GigE, and 100 Gigabit fiber connections. The connections are used to host a pair of Link Aggregation Groups, one for the link to the parent region, and another to your local network. The link to the parent region is used for control and VPC traffic; all connections originate from the Outpost. Traffic to and from your local network flows through a Local Gateway (LGW), giving you full control over access and routing. Here’s an overview of the networking topology within your premises: You will need to allocate a /26 CIDR block to each Outpost, which is advertised as a pair of /27 blocks in order to protect against device and link failures. The CIDR block can be within your own range of public IP addresses, or it can be an RFC 1918 private address plus NAT at your network edge. Outposts are simply new subnets on an existing VPC in the parent region.
Here’s how to create one: $ aws ec2 create-subnet --vpc-id VVVVVV \ --cidr-block A.B.C.D/24 \ --outpost-arn arn:aws:outposts:REGION:ACCOUNT_ID:outpost:OUTPOST_ID If you have Cisco or Juniper hardware in your data center, the following guides will be helpful: Cisco – Outposts Solution Overview. To learn more about the partnership between AWS and Cisco, visit this page. Juniper – AWS Outposts in a Juniper QFX-Based Datacenter. In most cases you will want to use AWS Direct Connect to establish a connection between your Outposts and the parent AWS Region. For more information on this and to learn a lot more about how to plan your Outposts network model, consult the How it Works documentation. Outpost Services We are launching with support for Amazon Elastic Compute Cloud (EC2), Amazon Elastic Block Store (EBS), Amazon Virtual Private Cloud, Amazon ECS, Amazon Elastic Kubernetes Service, and Amazon EMR, with additional services in the works. Amazon RDS for PostgreSQL and Amazon RDS for MySQL are available in preview form. Your applications can also make use of any desired services in the parent region, including Amazon Simple Storage Service (S3), Amazon DynamoDB, Auto Scaling, AWS CloudFormation, Amazon CloudWatch, AWS CloudTrail, AWS Config, Load Balancing, and so forth. You can create and use Interface Endpoints from within the VPC, or you can access the services through the regional public endpoints. Services & applications in the parent region that launch, manage, or refer to EC2 instances or EBS volumes can operate on those objects within an Outpost with no changes. Purchasing an Outpost The process of purchasing an Outpost is a bit more involved than that of launching an EC2 instance or creating an S3 bucket, but it should be straightforward. I don’t actually have a data center, and won’t actually take delivery of an Outpost, but I’ll do my best to show you the actual experience! The first step is to describe and qualify my site. I enter my address: I confirm temperature, humidity, and airflow at the rack position, that my loading dock can accommodate the shipping crate, and that there’s a clear access path from the loading dock to the rack’s final resting position: I provide information about my site’s power configuration: And the networking configuration: After I create the site, I create my Outpost: Now I am ready to order my hardware. I can choose any one of 18 standard configurations, with varied amounts of compute capacity and storage (custom configurations are also available), and click Create order to proceed: The EC2 capacity shown above indicates the largest instance size of a particular type. I can launch instances of that size, or I can use the smaller sizes, as needed. For example, the capacity of the OR-HUZEI16 configuration that I selected is listed as 7 m5.24xlarge instances and 3 c5.24xlarge instances. I could launch a total of 10 instances in those sizes, or (if I needed lots of smaller ones) I could launch 168 m5.xlarge instances and 72 c5.xlarge instances. I could also use a variety of sizes, subject to available capacity and the details of how the instances are assigned to the hardware. I confirm my order, choose the Outpost that I created earlier, and click Submit order: My order will be reviewed, my colleagues might give me a call to review some details, and my Outpost will be shipped to my site.
A team of AWS installers will arrive to unpack & inspect the Outpost, transport it to its resting position in my data center, and work with my data center operations (DCO) team to get it connected and powered up. Once the Outpost is powered up and the network is configured, it will set itself up automatically. At that point I can return to the console and monitor capacity exceptions (situations where demand exceeds supply), capacity availability, and capacity utilization: Using an Outpost The next step is to set up one or more subnets in my Outpost, as shown above. Then I can launch EC2 instances and create EBS volumes in the subnet, just as I would with any other VPC subnet. I can ask for more capacity by selecting Increase capacity from the Actions menu: The AWS team will contact me within 3 business days to discuss my options. Things to Know Here are a couple of other things to keep in mind when thinking about using Outposts: Availability – Outposts are available in the following countries: North America (United States); Europe (all EU countries, Switzerland, and Norway); Asia Pacific (Japan, South Korea, and Australia). Support – You must subscribe to AWS Enterprise Support in order to purchase an Outpost. We will remotely monitor your Outpost, and keep it happy & healthy over time. We’ll look for failing components and arrange to replace them without disturbing your operations. Billing & Payment Options – You can purchase Outposts on a three-year term, with All Upfront, Partial Upfront, and No Upfront payment options. The purchase price covers all EC2 and EBS usage within the Outpost; other services are billed by the hour, with the EC2 and EBS portions removed. You pay the regular inter-AZ data transfer charge to move data between an Outpost and another subnet in the same VPC, and the usual AWS data transfer charge for data that exits to the Internet across the link to the parent region. Capacity Expansion – Today, you can group up to 16 racks into a single capacity pool. Over time we expect to allow you to group thousands of racks together in this manner. Stay Tuned This is, like most AWS announcements, just the starting point. We have a lot of cool stuff in the works, and it is still Day One for AWS Outposts! — Jeff;

AWS Now Available from a Local Zone in Los Angeles

AWS customers are always asking for more features, more bandwidth, more compute power, and more memory, while also asking for lower latency and lower prices. We do our best to meet these competing demands: we launch new EC2 instance types, EBS volume types, and S3 storage classes at a rapid pace, and we also reduce prices regularly.

AWS in Los Angeles
Today we are launching a Local Zone in Los Angeles, California. The Local Zone is a new type of AWS infrastructure deployment that brings select AWS services very close to a particular geographic area. This Local Zone is designed to provide very low latency (single-digit milliseconds) to applications that are accessed from Los Angeles and other locations in Southern California. It will be of particular interest to demanding applications that are sensitive to latency. This includes:

Media & Entertainment – Gaming, 3D modeling & rendering, video processing (including real-time color correction), video streaming, and media production pipelines.
Electronic Design Automation – Interactive design & layout, simulation, and verification.
Ad-Tech – Rapid decision making & ad serving.
Machine Learning – Fast, continuous model training; high-performance low-latency inferencing.

All About Local Zones
The new Local Zone in Los Angeles is a logical part of the US West (Oregon) Region (which I will refer to as the parent region), and has some unique and interesting characteristics:

Naming – The Local Zone can be accessed programmatically as us-west-2-lax-1a. All API, CLI, and Console access takes place through the us-west-2 API endpoint and the US West (Oregon) Console.

Opt-In – You will need to opt in to the Local Zone in order to use it. After opting in, you can create a new VPC subnet in the Local Zone, taking advantage of all relevant VPC features including Security Groups, Network ACLs, and Route Tables. You can target the Local Zone when you launch EC2 instances and other resources, or you can create a default subnet in the VPC and have it happen automatically.

Networking – The Local Zone in Los Angeles is connected to US West (Oregon) over Amazon’s private backbone network. Connections to the public internet take place across an Internet Gateway, giving you local ingress and egress to reduce latency. Elastic IP Addresses can be shared by a group of Local Zones in a particular geographic location, but they do not move between a Local Zone and the parent region. The Local Zone also supports AWS Direct Connect, giving you the opportunity to route your traffic over a private network connection.

Services – We are launching with support for seven EC2 instance types (T3, C5, M5, R5, R5d, I3en, and G4), two EBS volume types (io1 and gp2), Amazon FSx for Windows File Server, Amazon FSx for Lustre, Application Load Balancer, and Amazon Virtual Private Cloud. Single-Zone RDS is on the near-term roadmap, and other services will come later based on customer demand. Applications running in a Local Zone can also make use of services in the parent region.

Parent Region – As I mentioned earlier, the new Local Zone is a logical extension of the US West (Oregon) region, and is managed by the “control plane” in the region. API calls, CLI commands, and the AWS Management Console should use “us-west-2” or US West (Oregon).

AWS – Other parts of AWS will continue to work as expected after you start to use this Local Zone. Your IAM resources, CloudFormation templates, and Organizations are still relevant and applicable, as are your tools and (perhaps most important) your investment in AWS training.

Pricing & Billing – Instances and other AWS resources in Local Zones will have different prices than in the parent region. Billing reports will include a prefix that is specific to a group of Local Zones that share a physical location. EC2 instances are available in On-Demand & Spot form, and you can also purchase Savings Plans.

Using a Local Zone
The first Local Zone is available today, and you can request access here:

In early 2020, you will be able to opt in using the console, the CLI, or an API call. After opting in, I can list my AZs and see that the Local Zone is included:

Then I create a new VPC subnet for the Local Zone. This gives me transparent, seamless connectivity between the parent region in Oregon and the Local Zone in Los Angeles, all within the VPC:

I can create EBS volumes:

They are, as usual, ready within seconds:

I can also see and use the Local Zone from within the AWS Management Console:

I can also use the AWS APIs, CloudFormation templates, and so forth (there’s a small API sketch at the end of this post).

Thinking Ahead
Local Zones give you even more architectural flexibility. You can think big, and you can think different! You now have the components, tools, and services at your fingertips to build applications that make use of any conceivable combination of legacy on-premises resources, modern on-premises cloud resources via AWS Outposts, resources in a Local Zone, and resources in one or more AWS regions.

In the fullness of time (as Andy Jassy often says), there could very well be more than one Local Zone in any given geographic area. In 2020, we will open a second one in Los Angeles (us-west-2-lax-1b), and we are giving consideration to other locations. We would love to get your advice on locations, so feel free to leave me a comment or two!

Now Available
The Local Zone in Los Angeles is available now and you can start using it today. Learn more about Local Zones.

— Jeff;
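As a minimal sketch of that API-level workflow (assuming the self-service opt-in described above is available in your account), the boto3 calls could look like this; the VPC ID and CIDR block are placeholders.

import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

# Opt in to the Los Angeles zone group.
ec2.modify_availability_zone_group(
    GroupName="us-west-2-lax-1",
    OptInStatus="opted-in",
)

# List all zones, including Local Zones, to confirm that us-west-2-lax-1a is visible.
zones = ec2.describe_availability_zones(AllAvailabilityZones=True)
print([z["ZoneName"] for z in zones["AvailabilityZones"]])

# Create a VPC subnet in the Local Zone; EC2 instances and EBS volumes launched
# into this subnet will live in Los Angeles.
ec2.create_subnet(
    VpcId="vpc-0123456789abcdef0",
    CidrBlock="10.0.32.0/24",
    AvailabilityZone="us-west-2-lax-1a",
)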

Amazon SageMaker Studio: The First Fully Integrated Development Environment For Machine Learning

Today, we’re extremely happy to launch Amazon SageMaker Studio, the first fully integrated development environment (IDE) for machine learning (ML). We have come a long way since we launched Amazon SageMaker in 2017, as shown by the growing number of customers using the service. However, the ML development workflow is still very iterative, and is challenging for developers to manage due to the relative immaturity of ML tooling. Many of the tools that developers take for granted when building traditional software (debuggers, project management, collaboration, monitoring, and so forth) have yet to be invented for ML. For example, when trying a new algorithm or tweaking hyperparameters, developers and data scientists typically run hundreds or even thousands of experiments on Amazon SageMaker, and they need to manage all of this manually. Over time, it becomes much harder to track the best performing models, and to capitalize on lessons learned during the course of experimentation.

Amazon SageMaker Studio at last unifies all the tools needed for ML development. Developers can write code, track experiments, visualize data, and perform debugging and monitoring all within a single, integrated visual interface, which significantly boosts developer productivity. In addition, since all these steps of the ML workflow are tracked within the environment, developers can quickly move back and forth between steps, and also clone, tweak, and replay them. This gives developers the ability to make changes quickly, observe outcomes, and iterate faster, reducing the time to market for high quality ML solutions.

Introducing Amazon SageMaker Studio
Amazon SageMaker Studio lets you manage your entire ML workflow through a single pane of glass. Let me give you the whirlwind tour!

With Amazon SageMaker Notebooks (currently in preview), you can enjoy an enhanced notebook experience that lets you easily create and share Jupyter notebooks. Without having to manage any infrastructure, you can also quickly switch from one hardware configuration to another.

With Amazon SageMaker Experiments, you can organize, track, and compare thousands of ML jobs: these can be training jobs, or data processing and model evaluation jobs run with Amazon SageMaker Processing (there is a small sketch at the end of this post).

With Amazon SageMaker Debugger, you can debug and analyze complex training issues, and receive alerts. It automatically introspects your models, collects debugging data, and analyzes it to provide real-time alerts and advice on ways to optimize your training times and improve model quality. All information is visible as your models are training.

With Amazon SageMaker Model Monitor, you can detect quality deviations for deployed models, and receive alerts. You can easily visualize issues like data drift that could be affecting your models. No code needed: all it takes is a few clicks.

With Amazon SageMaker Autopilot, you can build models automatically with full control and visibility. Algorithm selection, data preprocessing, and model tuning are taken care of automatically, as is all the underlying infrastructure.

Thanks to these new capabilities, Amazon SageMaker now covers the complete ML workflow to build, train, and deploy machine learning models, quickly and at any scale. The services mentioned above, except for Amazon SageMaker Notebooks, are covered in individual blog posts (see below) showing you how to quickly get started, so keep your eyes peeled and read on!
Amazon SageMaker Debugger
Amazon SageMaker Model Monitor
Amazon SageMaker Autopilot
Amazon SageMaker Experiments

Now Available!
Amazon SageMaker Studio is available today in US East (Ohio). Give it a try, and please send us feedback either in the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

- Julien
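To make the Experiments capability mentioned above a bit more concrete, here is a minimal sketch using the sagemaker-experiments Python package. The experiment and trial names are placeholders, and the calls shown are an illustrative outline rather than a definitive recipe.

# pip install sagemaker-experiments
import boto3
from smexperiments.experiment import Experiment
from smexperiments.trial import Trial

sm = boto3.client("sagemaker")

# An experiment groups related training, processing, and evaluation jobs.
experiment = Experiment.create(
    experiment_name="churn-prediction",
    description="Hyperparameter exploration for the churn model",
    sagemaker_boto_client=sm,
)

# Each variation (for example, a different learning rate) gets its own trial.
trial = Trial.create(
    trial_name="churn-lr-0-01",
    experiment_name=experiment.experiment_name,
    sagemaker_boto_client=sm,
)

# Passing an experiment_config when launching a training job attaches that job
# to the trial, so it shows up in the Experiments view in SageMaker Studio.
experiment_config = {
    "ExperimentName": experiment.experiment_name,
    "TrialName": trial.trial_name,
    "TrialComponentDisplayName": "Training",
}
# estimator.fit(inputs, experiment_config=experiment_config)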

Amazon SageMaker Debugger – Debug Your Machine Learning Models

Today, we’re extremely happy to announce Amazon SageMaker Debugger, a new capability of Amazon SageMaker that automatically identifies complex issues developing in machine learning (ML) training jobs.

Building and training ML models is a mix of science and craft (some would even say witchcraft). From collecting and preparing data sets to experimenting with different algorithms to figuring out optimal training parameters (the dreaded hyperparameters), ML practitioners need to clear quite a few hurdles to deliver high-performance models. This is the very reason why we built Amazon SageMaker: a modular, fully managed service that simplifies and speeds up ML workflows.

As I keep finding out, ML seems to be one of Mr. Murphy’s favorite hangouts, and everything that may possibly go wrong often does! In particular, many obscure issues can happen during the training process, preventing your model from correctly extracting and learning patterns present in your data set. I’m not talking about software bugs in ML libraries (although they do happen too): most failed training jobs are caused by an inappropriate initialization of parameters, a poor combination of hyperparameters, a design issue in your own code, and so on. To make things worse, these issues are rarely visible immediately: they grow over time, slowly but surely ruining your training process, and yielding low accuracy models. Let’s face it, even if you’re a bona fide expert, it’s devilishly difficult and time-consuming to identify them and hunt them down, which is why we built Amazon SageMaker Debugger. Let me tell you more.

Introducing Amazon SageMaker Debugger
In your existing training code for TensorFlow, Keras, Apache MXNet, PyTorch, and XGBoost, you can use the new SageMaker Debugger SDK to save internal model state at periodic intervals; as you can guess, it will be stored in Amazon Simple Storage Service (S3). This state is composed of:

The parameters being learned by the model, e.g. weights and biases for neural networks,
The changes applied to these parameters by the optimizer, aka gradients,
The optimization parameters themselves,
Scalar values, e.g. accuracies and losses,
The output of each layer,
and so on.

Each specific set of values – say, the sequence of gradients flowing over time through a specific neural network layer – is saved independently, and referred to as a tensor. Tensors are organized in collections (weights, gradients, etc.), and you can decide which ones you want to save during training. Then, using the SageMaker SDK and its estimators, you configure your training job as usual, passing additional parameters defining the rules you want SageMaker Debugger to apply.

A rule is a piece of Python code that analyzes tensors for the model in training, looking for specific unwanted conditions. Pre-defined rules are available for common problems such as exploding/vanishing tensors (parameters reaching NaN or zero values), exploding/vanishing gradients, loss not changing, and more. Of course, you can also write your own rules (there is a small sketch of a custom rule at the end of this post).

Once the SageMaker estimator is configured, you can launch the training job. Immediately, it fires up a debug job for each rule that you configured, and they start inspecting available tensors. If a debug job detects a problem, it stops and logs additional information. A CloudWatch Events event is also sent, should you want to trigger additional automated steps.

So now you know that your deep learning job suffers from, say, vanishing gradients.
With a little brainstorming and experience, you’ll know where to look: maybe the neural network is too deep? Maybe your learning rate is too small? As the internal state has been saved to S3, you can now use the SageMaker Debugger SDK to explore the evolution of tensors over time, confirm your hypothesis, and fix the root cause. Let’s see SageMaker Debugger in action with a quick demo.

Debugging Machine Learning Models with Amazon SageMaker Debugger
At the core of SageMaker Debugger is the ability to capture tensors during training. This requires a little bit of instrumentation in your training code, in order to select the tensor collections you want to save, the frequency at which you want to save them, and whether you want to save the values themselves or a reduction (mean, average, etc.). For this purpose, the SageMaker Debugger SDK provides simple APIs for each framework that it supports. Let me show you how this works with a simple TensorFlow script, trying to fit a two-dimensional linear regression model. Of course, you’ll find more examples in this GitHub repository.

Let’s take a look at the initial code:

import argparse
import numpy as np
import tensorflow as tf
import random

parser = argparse.ArgumentParser()
parser.add_argument('--model_dir', type=str, help="S3 path for the model")
parser.add_argument('--lr', type=float, help="Learning Rate", default=0.001)
parser.add_argument('--steps', type=int, help="Number of steps to run", default=100)
parser.add_argument('--scale', type=float, help="Scaling factor for inputs", default=1.0)
args = parser.parse_args()

with tf.name_scope('initialize'):
    # 2-dimensional input sample
    x = tf.placeholder(shape=(None, 2), dtype=tf.float32)
    # Initial weights: [10, 10]
    w = tf.Variable(initial_value=[[10.], [10.]], name='weight1')
    # True weights, i.e. the ones we're trying to learn
    w0 = [[1], [1.]]
with tf.name_scope('multiply'):
    # Compute true label
    y = tf.matmul(x, w0)
    # Compute "predicted" label
    y_hat = tf.matmul(x, w)
with tf.name_scope('loss'):
    # Compute loss
    loss = tf.reduce_mean((y_hat - y) ** 2, name="loss")

optimizer = tf.train.AdamOptimizer(args.lr)
optimizer_op = optimizer.minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(args.steps):
        x_ = np.random.random((10, 2)) * args.scale
        _loss, opt = sess.run([loss, optimizer_op], {x: x_})
        print(f'Step={i}, Loss={_loss}')

Let’s train this script using the TensorFlow Estimator. I’m using SageMaker local mode, which is a great way to quickly iterate on experimental code.

import sagemaker
from sagemaker.tensorflow import TensorFlow

bad_hyperparameters = {'steps': 10, 'lr': 100, 'scale': 100000000000}

estimator = TensorFlow(
    role=sagemaker.get_execution_role(),
    base_job_name='debugger-simple-demo',
    train_instance_count=1,
    train_instance_type='local',
    entry_point='script-v1.py',
    framework_version='1.13.1',
    py_version='py3',
    script_mode=True,
    hyperparameters=bad_hyperparameters)

Looking at the training log, things did not go well.

Step=0, Loss=7.883463958023267e+23
algo-1-hrvqg_1 | Step=1, Loss=9.502028841062608e+23
algo-1-hrvqg_1 | Step=2, Loss=nan
algo-1-hrvqg_1 | Step=3, Loss=nan
algo-1-hrvqg_1 | Step=4, Loss=nan
algo-1-hrvqg_1 | Step=5, Loss=nan
algo-1-hrvqg_1 | Step=6, Loss=nan
algo-1-hrvqg_1 | Step=7, Loss=nan
algo-1-hrvqg_1 | Step=8, Loss=nan
algo-1-hrvqg_1 | Step=9, Loss=nan

Loss does not decrease at all, and even goes to infinity… This looks like an exploding tensor problem, which is one of the conditions that the built-in SageMaker Debugger rules can detect. Let’s get to work.
Using the Amazon SageMaker Debugger SDK
In order to capture tensors, I need to instrument the training script with:

A SaveConfig object specifying the frequency at which tensors should be saved,
A SessionHook object attached to the TensorFlow session, putting everything together and saving required tensors during training,
An (optional) ReductionConfig object, listing tensor reductions that should be saved instead of full tensors,
An (optional) optimizer wrapper to capture gradients.

Here’s the updated code, with extra command line arguments for the SageMaker Debugger parameters.

import argparse
import numpy as np
import tensorflow as tf
import random
import smdebug.tensorflow as smd

parser = argparse.ArgumentParser()
parser.add_argument('--model_dir', type=str, help="S3 path for the model")
parser.add_argument('--lr', type=float, help="Learning Rate", default=0.001)
parser.add_argument('--steps', type=int, help="Number of steps to run", default=100)
parser.add_argument('--scale', type=float, help="Scaling factor for inputs", default=1.0)
parser.add_argument('--debug_path', type=str, default='/opt/ml/output/tensors')
parser.add_argument('--debug_frequency', type=int, help="How often to save tensor data", default=10)
feature_parser = parser.add_mutually_exclusive_group(required=False)
feature_parser.add_argument('--reductions', dest='reductions', action='store_true',
                            help="save reductions of tensors instead of saving full tensors")
feature_parser.add_argument('--no_reductions', dest='reductions', action='store_false',
                            help="save full tensors")
args = parser.parse_args()

reduc = smd.ReductionConfig(reductions=['mean'], abs_reductions=['max'], norms=['l1']) if args.reductions else None

hook = smd.SessionHook(out_dir=args.debug_path,
                       include_collections=['weights', 'gradients', 'losses'],
                       save_config=smd.SaveConfig(save_interval=args.debug_frequency),
                       reduction_config=reduc)

with tf.name_scope('initialize'):
    # 2-dimensional input sample
    x = tf.placeholder(shape=(None, 2), dtype=tf.float32)
    # Initial weights: [10, 10]
    w = tf.Variable(initial_value=[[10.], [10.]], name='weight1')
    # True weights, i.e. the ones we're trying to learn
    w0 = [[1], [1.]]
with tf.name_scope('multiply'):
    # Compute true label
    y = tf.matmul(x, w0)
    # Compute "predicted" label
    y_hat = tf.matmul(x, w)
with tf.name_scope('loss'):
    # Compute loss
    loss = tf.reduce_mean((y_hat - y) ** 2, name="loss")
    hook.add_to_collection('losses', loss)

optimizer = tf.train.AdamOptimizer(args.lr)
optimizer = hook.wrap_optimizer(optimizer)
optimizer_op = optimizer.minimize(loss)

hook.set_mode(smd.modes.TRAIN)

with tf.train.MonitoredSession(hooks=[hook]) as sess:
    for i in range(args.steps):
        x_ = np.random.random((10, 2)) * args.scale
        _loss, opt = sess.run([loss, optimizer_op], {x: x_})
        print(f'Step={i}, Loss={_loss}')

I also need to modify the TensorFlow Estimator, to use the SageMaker Debugger-enabled training container and to pass additional parameters.
bad_hyperparameters = {'steps': 10, 'lr': 100, 'scale': 100000000000, 'debug_frequency': 1}

from sagemaker.debugger import Rule, rule_configs

estimator = TensorFlow(
    role=sagemaker.get_execution_role(),
    base_job_name='debugger-simple-demo',
    train_instance_count=1,
    train_instance_type='ml.c5.2xlarge',
    image_name=cpu_docker_image_name,
    entry_point='script-v2.py',
    framework_version='1.15',
    py_version='py3',
    script_mode=True,
    hyperparameters=bad_hyperparameters,
    rules=[Rule.sagemaker(rule_configs.exploding_tensor())]
)

estimator.fit()

2019-11-27 10:42:02 Starting - Starting the training job...
2019-11-27 10:42:25 Starting - Launching requested ML instances

********* Debugger Rule Status *********
*                                      *
*  ExplodingTensor: InProgress         *
*                                      *
****************************************

Two jobs are running: the actual training job, and a debug job checking for the rule defined in the Estimator. The debug job quickly fails! Describing the training job, I can get more information on what happened.

import boto3

# job_name is the name of the training job launched above
client = boto3.client('sagemaker')
description = client.describe_training_job(TrainingJobName=job_name)
print(description['DebugRuleEvaluationStatuses'][0]['RuleConfigurationName'])
print(description['DebugRuleEvaluationStatuses'][0]['RuleEvaluationStatus'])

ExplodingTensor
IssuesFound

Let’s take a look at the saved tensors.

Exploring Tensors
I can easily grab the tensors saved in S3 during the training process.

from smdebug.trials import create_trial

s3_output_path = description["DebugConfig"]["DebugHookConfig"]["S3OutputPath"]
trial = create_trial(s3_output_path)

Let’s list the available tensors.

trial.tensors()

['loss/loss:0', 'gradients/multiply/MatMul_1_grad/tuple/control_dependency_1:0', 'initialize/weight1:0']

All values are numpy arrays, and I can easily iterate over them.

tensor = 'gradients/multiply/MatMul_1_grad/tuple/control_dependency_1:0'
for s in list(trial.tensor(tensor).steps()):
    print("Value: ", trial.tensor(tensor).step(s).value)

Value:  [[1.1508383e+23] [1.0809098e+23]]
Value:  [[1.0278440e+23] [1.1347468e+23]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]

As tensor names include the TensorFlow scope defined in the training code, I can easily see that something is wrong with my matrix multiplication.

# Compute true label
y = tf.matmul(x, w0)
# Compute "predicted" label
y_hat = tf.matmul(x, w)

Digging a little deeper, the x input is modified by a scaling parameter, which I set to 100000000000 in the Estimator. The learning rate doesn’t look sane either. Bingo!

x_ = np.random.random((10, 2)) * args.scale
bad_hyperparameters = {'steps': 10, 'lr': 100, 'scale': 100000000000, 'debug_frequency': 1}

As you probably knew all along, setting these hyperparameters to more reasonable values will fix the training issue.

Now Available!
We believe Amazon SageMaker Debugger will help you find and solve training issues more quickly, so it’s now your turn to go bug hunting. Amazon SageMaker Debugger is available today in all commercial regions where Amazon SageMaker is available. Give it a try and please send us feedback, either on the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

- Julien
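As mentioned earlier, you can also write your own rules. Here is a minimal sketch of what a custom rule could look like with the smdebug library; the class name, the threshold, and the choice of the 'gradients' collection are illustrative assumptions rather than part of the announcement above. A rule like this can be attached to an estimator with Rule.custom() from the SageMaker Python SDK.

from smdebug.rules.rule import Rule


class CustomGradientRule(Rule):
    """Fires when the mean absolute gradient of any watched tensor exceeds a threshold."""

    def __init__(self, base_trial, threshold=10.0):
        super().__init__(base_trial)
        self.threshold = float(threshold)

    def invoke_at_step(self, step):
        # Inspect every tensor saved in the 'gradients' collection at this step.
        for tname in self.base_trial.tensor_names(collection="gradients"):
            abs_mean = self.base_trial.tensor(tname).reduction_value(step, "mean", abs=True)
            if abs_mean > self.threshold:
                # Returning True signals that the rule condition was met, which
                # stops the debug job and emits a CloudWatch Events event.
                return True
        return False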
