Amazon Web Services Blog

How AWS helps our Customers to go Global – Report from Korea

Amazon Web Services Korea LLC (AWS Korea) opened an office in Seoul, South Korea in 2012. This office has educated and supported many customers, from startups to large enterprises. Owing to high customer demand, we launched our Asia Pacific (Seoul) Region with two Availability Zones and two edge locations in January 2016. This Region has given AWS customers in Korea low-latency access to our suite of AWS infrastructure services. Andy Jassy, CEO of Amazon Web Services, announced the launch of the Seoul Region at the AWS Cloud 2016 event. Following this launch, Amazon CloudFront added two more edge locations and one regional edge cache in Korea: the third edge location in May 2016 and the fourth in February 2018. CloudFront’s expansion across Korea further improves the availability and performance of content delivery to users in the region. Today I am happy to announce that AWS has added a third Availability Zone (AZ) to the AWS Asia Pacific (Seoul) Region to support the high demand of our growing Korean customer base. This third AZ provides customers with additional flexibility to architect scalable, fault-tolerant, and highly available applications in AWS Asia Pacific (Seoul), and will support additional AWS services in Korea. This launch brings AWS’s global total to 66 AZs within 21 geographic Regions around the world. AZs located in AWS Regions consist of one or more discrete data centers, each with redundant power, networking, and connectivity, and each housed in separate facilities. AWS now serves tens of thousands of active customers in Korea, ranging from startups and enterprises to educational institutions. One example that reflects this demand is AWS Summit Seoul 2019, part of our commitment to investing in education. More than 16,000 builders attended, a greater than tenfold increase from the 1,500 attendees of our first Summit in 2015.
AWS Summit 2018 – a photo of the keynote by Dr. Werner Vogels, CTO of Amazon.com
So, how have Korean customers migrated to the AWS Cloud, and what has motivated them? They have learned that the AWS Cloud is the new normal in the IT industry, and adopting it quickly has allowed them to regain global competitiveness. Let us look at some examples of how our customers are using the breadth and depth of the AWS Cloud platform to take the services they built in Korea to the global market. Do you know about the Korean Wave? The Korean Wave refers to the growing global popularity of South Korean culture, such as K-Pop and Korean dramas. The top three broadcasting companies in Korea (KBS, MBC, and SBS) use AWS. They co-invested to found Content Alliance Platform (CAP), which launched POOQ, an OTT service that streams TV programs, including popular K-dramas, in real time to more than 600,000 subscribers, and which has reduced buffering times on its streaming services by 20 percent. CAP also used AWS’s video processing and delivery services to stream Korea’s largest sports event, the PyeongChang 2018 Olympic Winter Games.
K-Pop fans at the KCON 2016 concert in France – Wikipedia
SM Entertainment is a South Korean entertainment company at the forefront of K-Pop, with artists such as NCT 127, EXO, Super Junior, and Girls’ Generation. The company uses AWS to deliver its websites and mobile applications. By using AWS, the company was able to scale to support more than 3 million new users of the EXO-L mobile app in three weeks. The company also developed its mobile karaoke app, Everysing, on AWS, saving more than 50 percent in development costs.
The scalability, flexibility, and pay-as-you-go pricing of AWS encouraged them to develop more mobile apps.
Global Enterprises on the Cloud
Korean enterprises have rapidly adopted the AWS Cloud to offer scalable, global services while staying focused on their core business needs. Samsung Electronics uses the breadth of AWS services to reduce infrastructure costs and achieve rapid deployments, which provides high availability to customers and allows it to scale its services globally to support Galaxy customers worldwide. For example, Samsung Electronics increased reliability and reduced costs by 40 percent within a year of migrating its 860 TB Samsung Cloud database to AWS. Samsung chose Amazon DynamoDB for its stability, scalability, and low latency to maintain the database used by 300 million Galaxy smartphone users worldwide. LG Electronics has selected AWS to run its mission-critical services for more than 35 million LG Smart TVs across the globe, handling the dramatic instant traffic peaks that come with broadcasting live sports events such as the World Cup and Olympic Games. It also built a new home appliance IoT platform called ThinQ. LG Electronics uses a serverless architecture and secure provisioning on AWS to reduce the development costs for this platform by 80 percent through increased efficiency in managing its developer and maintenance resources. Recently, Korean Air decided to move its entire infrastructure to AWS over the next three years – including its website, loyalty program, flight operations, and other mission-critical operations – and will shut down its data centers after this migration. “This will enable us to bring new services to market faster and more efficiently, so that customer satisfaction continues to increase,” said Kenny Chang, CIO of Korean Air.
AWS Customers in Korea – From Startups to Enterprises in Each Industry
AI/ML on Traditional Manufacturers
AWS is helping Korean manufacturing companies realize the benefits of digitalization and regain global competitiveness by leveraging the collective experience we have gained from working with customers and partners around the world. Kia Motors produces three million vehicles a year for customers worldwide. It uses Amazon Rekognition and Amazon Polly to develop a car log-in feature that combines face analysis and voice services. Introduced at CES 2018, this system welcomes drivers and adjusts settings such as seating, mirrors, and in-vehicle infotainment based on individual preferences to create a personalized driving experience. Coway, a Korean home appliance company, uses AWS for IoCare, its IoT service for tens of thousands of air and water purifiers. It migrated IoCare from on-premises to AWS for the speed and efficiency needed to handle increasing traffic as its business grew. Coway uses AWS managed services such as AWS IoT, Amazon Kinesis, Amazon DynamoDB, AWS Lambda, Amazon RDS, and Amazon ElastiCache, and has also integrated an Alexa skill, backed by AWS Lambda, with its high-end Airmega air purifier for the global market.
Play Amazing Games
AWS has transformed the nature of Korean gaming companies, allowing them to launch and expand their businesses globally on their own, without help from local publishers. As a result, the top 15 gaming companies in Korea are currently using AWS, including Nexon, NC Soft, Krafton, Netmarble, and Kakao Games. Krafton is the developer of the hit video game PlayerUnknown’s Battlegrounds (PUBG), which was developed on AWS in less than 18 months.
The game uses AWS Lambda, Amazon SQS, and AWS CodeDeploy for its core backend service, Amazon DynamoDB as its primary game database, and Amazon Redshift as its data analytics platform. PUBG broke records upon release, with more than 3 million concurrent players connected to the game. Nexon is a top Korean gaming company that produces hit mobile games such as Heroes of Incredible Tales (HIT). By using AWS, it achieved cost savings of more than 30 percent on global infrastructure management and can now launch new games more quickly. Nexon uses Amazon DynamoDB for its game database and first started using AWS to respond to unpredictable spikes in user demand.
Startups to go Global
Many hot startups in Korea are using AWS to grow in the local market, and here are some great examples of companies that, although based in Korea, have gone global. Azar, Hyperconnect’s video-based social discovery mobile app, has recorded 300 million downloads, is now available in over 200 countries around the world, and made 20 billion cumulative matches in the last year. To overcome the complex matching challenges behind reliable video chats between users, Hyperconnect uses a range of AWS services efficiently: Amazon EC2, Amazon RDS, and Amazon SES to reduce the cost of managing its global infrastructure, and Amazon S3 and Amazon CloudFront to store and deliver service data to global users faster. It also uses Amazon EMR to manage the vast amount of data generated by 40 million matches per day. SendBird provides chat APIs and a messaging SDK used in more than 10,000 apps globally, processing about 700 million messages per month. It uses AWS global Regions to provide a top-class customer experience by keeping latency under 100 ms everywhere in the world. Amazon ElastiCache is currently used to handle large volumes of chat data, and all data is stored in encrypted Amazon Aurora for integrity and reliability. Server log data is analyzed and processed using Amazon Kinesis Data Firehose and Amazon Athena.
Freedom for the Local Financial Industry
We also see Korean enterprises in the financial services industry leverage AWS to digitally transform their businesses through data analytics, fintech, and digital banking initiatives. Financial services companies in Korea are leveraging AWS to deliver an enhanced customer experience; examples of these customers include Shinhan Financial Group, KB Kookmin Bank, Kakao Pay, Mirae Asset, and Yuanta Securities. Shinhan Financial Group achieved a 50 percent cost reduction and a 20 percent response-time reduction after migrating its North American and Japanese online banking services to AWS. Shinhan’s new Digital Platform unit now uses Amazon ECS, Amazon CloudFront, and other services to reduce development time for new applications by 50 percent. Shinhan is currently pursuing an all-in migration to AWS that includes moving more than 150 workloads. Hyundai Card, a top Korean credit card company and a financial subsidiary of the Hyundai Kia Motor Group, built a dev/test platform called Playground on AWS so that its development team can prototype new software and services. The company uses Amazon EMR, AWS Glue, and Amazon Kinesis for cost and architecture optimization. Playground allows quick testing of new projects without waiting for resource allocation from on-premises infrastructure, reducing the development period by 3-4 months.
Security and Compliance
At AWS, the security, privacy, and protection of customer data always come first, and AWS addresses local requirements as well as global security and compliance standards.
Our most recent example of this commitment is that AWS became the first global cloud service provider to achieve the Korea Information Security Management System (K-ISMS) certification in December 2017. With this certification, enterprises and organizations across Korea are able to meet their compliance requirements more effectively and accelerate business transformation by using best-in-class technology delivered from the highly secure and reliable AWS Cloud. AWS also completed its first annual surveillance audit for the K-ISMS certification in 2018. In April 2019, AWS achieved the Multi-Tier Cloud Security Standard (MTCS) Level-3 certification for the Seoul Region, making AWS the first cloud service provider in Korea to do so. With the MTCS, FSI customers in Korea can accelerate cloud adoption because they no longer have to validate the 109 controls required by the relevant regulations (the Financial Security Institute’s Guideline on Use of Cloud Computing Services in Financial Industry and the Regulation on Supervision on Electronic Financial Transactions (RSEFT)). AWS also published a workbook for Korean FSI customers, covering those controls and 32 additional controls from the RSEFT.
Supporting and Enabling Korean Customers
AWS Korea has made significant investments in education and training in Korea. Tens of thousands of people, including IT professionals, developers, and students, have been trained in AWS cloud skills over the last two years. AWS Korea also supports community-driven activities to enhance the developer ecosystem of cloud computing in Korea. To date, the AWS Korean User Group has tens of thousands of members, who hold hundreds of meetups across Korea annually. The AWS Educate program is expected to accelerate Korean students’ cloud computing capabilities, helping them acquire cloud expertise that is becoming increasingly relevant for their future employment. Dozens of universities, including Sogang University, Yonsei University, and Seoul National University, have joined this program, with thousands of students participating in AWS-related classes and in e-learning programs such as Like a Lion, a non-profit organization that teaches coding to students. AWS is building a vibrant cloud ecosystem with hundreds of partners ― Systems Integrator (SI) partners include LG CNS, Samsung SDS, Youngwoo Digital, Saltware, NDS, and many others. Among them, Megazone, GS Neotek, and Bespin Global are AWS Premier Consulting Partners. Independent Software Vendor (ISV) partners include AhnLab, Hancom, SK Infosec, SendBird, and IGAWorks. These partners help our customers adopt AWS services in their workloads, migrate from on-premises environments, and launch new services.
The customer celebration whiteboard for the 5th anniversary of AWS Summit Seoul
Finally, I want to share some of the customer feedback written (in Korean) on our whiteboard at AWS Summit Seoul 2019. Here is one voice from it ― "It made me decide to voluntarily become an AWS customer, to climb onto the shoulders of a giant and see the world." We will always listen to our customers' voices and build the broadest and deepest cloud platform so that they can succeed in both the Korean and global markets. – Channy Yun; This article was translated into Korean (한국어) on the AWS Korea Blog.

Amazon S3 Path Deprecation Plan – The Rest of the Story

Last week we made a fairly quiet (too quiet, in fact) announcement of our plan to slowly and carefully deprecate the path-based access model that is used to specify the address of an object in an S3 bucket. I spent some time talking to the S3 team in order to get a better understanding of the situation in order to write this blog post. Here’s what I learned… We launched S3 in early 2006. Jeff Bezos’ original spec for S3 was very succinct – he wanted malloc (a key memory allocation function for C programs) for the Internet. From that starting point, S3 has grown to the point where it now stores many trillions of objects and processes millions of requests per second for them. Over the intervening 13 years, we have added many new storage options, features, and security controls to S3. Old vs. New S3 currently supports two different addressing models: path-style and virtual-hosted style. Let’s take a quick look at each one. The path-style model looks like either this (the global S3 endpoint): https://s3.amazonaws.com/jbarr-public/images/ritchie_and_thompson_pdp11.jpeg https://s3.amazonaws.com/jeffbarr-public/classic_amazon_door_desk.png Or this (one of the regional S3 endpoints): https://s3-us-east-2.amazonaws.com/jbarr-public/images/ritchie_and_thompson_pdp11.jpeg https://s3-us-east-2.amazonaws.com/jeffbarr-public/classic_amazon_door_desk.png In this example, jbarr-public and jeffbarr-public are bucket names; /images/ritchie_and_thompson_pdp11.jpeg and /jeffbarr-public/classic_amazon_door_desk.png are object keys. Even though the objects are owned by distinct AWS accounts and are in different S3 buckets (and possibly in distinct AWS regions), both of them are in the DNS subdomain s3.amazonaws.com. Hold that thought while we look at the equivalent virtual-hosted style references (although you might think of these as “new,” they have been around since at least 2010): https://jbarr-public.s3.amazonaws.com/images/ritchie_and_thompson_pdp11.jpeg https://jeffbarr-public.s3.amazonaws.com/classic_amazon_door_desk.png These URLs reference the same objects, but the objects are now in distinct DNS subdomains (jbarr-public.s3.amazonaws.com and jeffbarr-public.s3.amazonaws.com, respectively). The difference is subtle, but very important. When you use a URL to reference an object, DNS resolution is used to map the subdomain name to an IP address. With the path-style model, the subdomain is always s3.amazonaws.com or one of the regional endpoints; with the virtual-hosted style, the subdomain is specific to the bucket. This additional degree of endpoint specificity is the key that opens the door to many important improvements to S3. Out with the Old In response to feedback on the original deprecation plan that we announced last week, we are making an important change. Here’s the executive summary: Original Plan – Support for the path-style model ends on September 30, 2020. Revised Plan – Support for the path-style model continues for buckets created on or before September 30, 2020. Buckets created after that date must be referenced using the virtual-hosted model. We are moving to virtual-hosted references for two reasons: First, anticipating a world with billions of buckets homed in many dozens of regions, routing all incoming requests directly to a small set of endpoints makes less and less sense over time. DNS resolution, scaling, security, and traffic management (including DDoS protection) are more challenging with this centralized model. 
The virtual-hosted model reduces the area of impact (which we call the “blast radius” internally) when problems arise; this helps us to increase availability and performance. Second, the team has a lot of powerful features in the works, many of which depend on the use of unique, virtual-hosted style subdomains. Moving to this model will allow you to benefit from these new features as soon as they are announced. For example, we are planning to deprecate some of the oldest security ciphers and versions (details to come later). The deprecation process is easier and smoother (for you and for us) if you are using virtual-hosted references. In With the New As just one example of what becomes possible when using virtual-hosted references, we are thinking about providing you with increased control over the security configuration (including ciphers and cipher versions) for each bucket. If you have ideas of your own, feel free to get in touch. Moving Ahead Here are some things to know about our plans: Identifying Path-Style References – You can use S3 Access Logs (look for the Host Header field) and AWS CloudTrail Data Events (look for the host element of the requestParameters entry) to identify the applications that are making path-style requests. Programmatic Access – If your application accesses S3 using one of the AWS SDKs, you don’t need to do anything, other than ensuring that your SDK is current. The SDKs already use virtual-hosted references to S3, except if the bucket name contains one or more “.” characters. Bucket Names with Dots – It is important to note that bucket names with “.” characters are perfectly valid for website hosting and other use cases. However, there are some known issues with TLS and with SSL certificates. We are hard at work on a plan to support virtual-host requests to these buckets, and will share the details well ahead of September 30, 2020. Non-Routable Names – Some characters that are valid in the path component of a URL are not valid as part of a domain name. Also, paths are case-sensitive, but domain and subdomain names are not. We’ve been enforcing more stringent rules for new bucket names since last year. If you have data in a bucket with a non-routable name and you want to switch to virtual-host requests, you can use the new S3 Batch Operations feature to move the data. However, if this is not a viable option, please reach out to AWS Developer Support. Documentation – We are planning to update the S3 Documentation to encourage all developers to build applications that use virtual-host requests. The Virtual Hosting documentation is a good starting point. We’re Here to Help The S3 team has been working with some of our customers to help them to migrate, and they are ready to work with many more. Our goal is to make this deprecation smooth and uneventful, and we want to help minimize any costs you may incur! Please do not hesitate to reach out to us if you have questions, challenges, or concerns. — Jeff; PS – Stay tuned for more information on tools and other resources.
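To see the difference between the two addressing styles in code, here is a minimal sketch using boto3; the bucket and key are the example names from above, the Region is assumed, and the addressing_style option is the SDK setting that selects between the two forms:

    import boto3
    from botocore.client import Config

    # Example names from the discussion above, used only for illustration.
    BUCKET = "jbarr-public"
    KEY = "images/ritchie_and_thompson_pdp11.jpeg"

    # Ask the SDK for virtual-hosted style addressing explicitly
    # (current SDKs already prefer it for compatible bucket names).
    s3 = boto3.client(
        "s3",
        region_name="us-east-2",
        config=Config(s3={"addressing_style": "virtual"}),
    )

    # Generate a presigned GET URL; with virtual-hosted addressing the bucket
    # name appears in the host name rather than in the path.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": BUCKET, "Key": KEY},
        ExpiresIn=300,
    )
    print(url)  # e.g. https://jbarr-public.s3.us-east-2.amazonaws.com/images/...

Switching the addressing_style value to "path" produces the older form, which can be handy when reproducing the requests you find while auditing access logs or CloudTrail data events.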

New – The Next Generation (I3en) of I/O-Optimized EC2 Instances

Amazon’s Customer Obsession leadership principle says: Leaders start with the customer and work backwards. They work vigorously to earn and keep customer trust. Although leaders pay attention to competitors, they obsess over customers. Starting from the customer and working backwards means that we do not invent in a vacuum. Instead, we speak directly to our customers (both external and internal), ask detailed questions, and pay attention to what we learn. On the AWS side, we often hear about new use cases that help us to get a better understanding of what our customers are doing with AWS. For example, large-scale EC2 users provide us with another set of interesting data points, often expressed in terms of ratios between dollars, vCPUs, memory size, storage size, and networking throughput. We launched the I3 instances (Now Available – I3 Instances for Demanding, I/O Intensive Workloads) just about two years ago. Our customers use them to host distributed file systems, relational & NoSQL databases, in-memory caches, key-value stores, data warehouses, and MapReduce clusters. Because our customers are always (in Jeff Bezos’ words) “divinely discontent”, they want I/O-optimized instances with even more power & storage. To be specific, they have asked us for:

A lower price per TB of storage
Increased storage density to allow consolidation of workloads and scale-up processing
A higher ratio of network bandwidth and instance storage to vCPUs

The crucial element here is that our customers were able to express their needs in a detailed and specific manner. Simply asking for something to be better, faster, and cheaper does not help us to make well-informed decisions.

New I3en Instances
Today I am happy to announce the I3en instances. Designed to meet these needs and to do an even better job of addressing the use cases that I mentioned above, these instances are powered by AWS-custom Intel Xeon Scalable (Skylake) processors with 3.1 GHz sustained all-core turbo performance, up to 60 TB of fast NVMe storage, and up to 100 Gbps of network bandwidth. Here are the specs:

Instance Name | vCPUs | Memory | Local Storage (NVMe SSD) | Random Read IOPS (4 K Block) | Read Throughput (128 K Block) | EBS-Optimized Bandwidth | Network Bandwidth
i3en.large | 2 | 16 GiB | 1 x 1.25 TB | 42.5 K | 325 MB/s | Up to 3,500 Mbps | Up to 25 Gbps
i3en.xlarge | 4 | 32 GiB | 1 x 2.50 TB | 85 K | 650 MB/s | Up to 3,500 Mbps | Up to 25 Gbps
i3en.2xlarge | 8 | 64 GiB | 2 x 2.50 TB | 170 K | 1.3 GB/s | Up to 3,500 Mbps | Up to 25 Gbps
i3en.3xlarge | 12 | 96 GiB | 1 x 7.5 TB | 250 K | 2 GB/s | Up to 3,500 Mbps | Up to 25 Gbps
i3en.6xlarge | 24 | 192 GiB | 2 x 7.5 TB | 500 K | 4 GB/s | 3,500 Mbps | 25 Gbps
i3en.12xlarge | 48 | 384 GiB | 4 x 7.5 TB | 1 M | 8 GB/s | 7,000 Mbps | 50 Gbps
i3en.24xlarge | 96 | 768 GiB | 8 x 7.5 TB | 2 M | 16 GB/s | 14,000 Mbps | 100 Gbps

In comparison to the I3 instances, the I3en instances offer:

A cost per GB of SSD instance storage that is up to 50% lower
Storage density (GB per vCPU) that is roughly 2.6x greater
Ratio of network bandwidth to vCPUs that is up to 2.7x greater

You will need HVM AMIs with the NVMe 1.0e and ENA drivers. You can also make use of the new Elastic Fabric Adapter (EFA) if you are using the i3en.24xlarge (read my recent post to learn more).

Now Available
You can launch I3en instances today in the US East (N. Virginia), US West (Oregon), and Europe (Ireland) Regions in On-Demand and Spot form. Reserved Instances, Dedicated Instances, and Dedicated Hosts are available. — Jeff;
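As a hedged sketch of launching one of these instances programmatically, here is the boto3 equivalent of the console launch flow; the AMI ID, key pair, and subnet are placeholders, and the AMI must include the NVMe and ENA drivers mentioned above:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Placeholder identifiers - substitute your own AMI, key pair, and subnet.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # an HVM AMI with NVMe 1.0e and ENA drivers
        InstanceType="i3en.large",
        KeyName="my-key-pair",
        SubnetId="subnet-0123456789abcdef0",
        MinCount=1,
        MaxCount=1,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Name", "Value": "i3en-test"}],
        }],
    )
    print(response["Instances"][0]["InstanceId"])

After launch, the NVMe instance store volumes appear as additional block devices that you format and mount yourself; their contents do not persist across stop/start cycles.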

Learn about AWS Services & Solutions – May AWS Online Tech Talks

Join us this May to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now! Note – All sessions are free and in Pacific Time. Tech talks this month: Compute May 30, 2019 | 11:00 AM – 12:00 PM PT – Your First HPC Cluster on AWS – Get a step-by-step walk-through of how to set up your first HPC cluster on AWS. Containers May 20, 2019 | 11:00 AM – 12:00 PM PT – Managing Application Deployments with the AWS Service Operator – Learn how the AWS Service Operator helps you provision AWS resources and your applications using kubectl CLI. May 22, 2019 | 9:00 AM – 10:00 AM PT – Microservice Deployment Strategies with AWS App Mesh – Learn how AWS App Mesh makes the process of deploying microservices on AWS easier and more reliable while providing you with greater control and visibility to support common deployment patterns. Data Lakes & Analytics May 20, 2019 | 9:00 AM – 10:00 AM PT – EKK is the New ELK: Aggregating, Analyzing and Visualizing Logs – Learn how to aggregate, analyze, & visualize your logs with Amazon Elasticsearch Service, Amazon Kinesis Data Firehose, and Kibana May 22, 2019 | 11:00 AM – 12:00 PM PT – Building a Data Streaming Application Leveraging Apache Flink – Learn how to build and manage a real-time streaming application with AWS using Java and leveraging Apache Flink. Databases May 20, 2019 | 1:00 PM – 2:00 PM PT – Migrating to Amazon DocumentDB – Learn how to migrate MongoDB workloads to Amazon DocumentDB, a fully managed MongoDB compatible database service designed from the ground up to be fast, scalable, and highly available. May 29, 2019 | 1:00 PM – 2:00 PM PT – AWS Fireside Chat: Optimize Your Business Continuity Strategy with Aurora Global Database – Join us for a discussion with the General Manager of Amazon Aurora to learn how you can use Aurora Global Database to build scalable and reliable apps. DevOps May 21, 2019 | 9:00 AM – 10:00 AM PT – Infrastructure as Code Testing Strategies with AWS CloudFormation – Learn about CloudFormation testing best practices, including what tools to use and when, both while authoring code and while testing in a continuous integration and continuous delivery pipeline. End-User Computing May 23, 2019 | 1:00 PM – 2:00 PM PT – Managing Amazon Linux WorkSpaces at Scale using AWS OpsWorks for Chef Automate – Learn how to simplify your Amazon Linux Workspaces management at scale by integrating with AWS OpsWorks for Chef Automate. Enterprise & Hybrid May 28, 2019 | 11:00 AM – 12:00 PM PT – What’s New in AWS Landing Zone – Learn how the AWS Landing Zone can automate the setup of best practice baselines when setting up new AWS environments. IoT May 29, 2019 | 11:00 AM – 12:30 PM PT – Customer Showcase: Extending Machine Learning to Industrial IoT Applications at the Edge – Learn how AWS IoT customers are building smarter products and bringing them to market quickly with Amazon FreeRTOS and AWS IoT Greengrass. Machine Learning May 28, 2019 | 9:00 AM – 10:00 AM PT – Build a Scalable Architecture to Automatically Extract and Import Form Data – Learn how to build a scalable architecture to process thousands of forms or documents with Amazon Textract. 
May 30, 2019 | 1:00 PM – 2:00 PM PT – Build Efficient and Accurate Recommendation Engines with Amazon Personalize – Learn how to build efficient and accurate recommendation engines with high impact to the bottom line of your business. Mobile May 21, 2019 | 1:00 PM – 2:00 PM PT – Deploying and Consuming Serverless Functions with AWS Amplify – Learn how to build and deploy serverless applications using React and AWS Amplify. Networking & Content Delivery May 31, 2019 | 1:00 PM – 2:00 PM PT – Simplify and Scale How You Connect Your Premises to AWS with AWS Direct Connect on AWS Transit Gateway – Learn how to use multi account Direct Connect gateway to interface your on-premises network with your AWS network through a AWS Transit Gateway. Security, Identity, & Compliance May 30, 2019 | 9:00 AM – 10:00 AM PT – Getting Started with Cross-Account Encryption Using AWS KMS, Featuring Slack Enterprise Key Management – Learn how to manage third-party access to your data keys with AWS KMS. May 31, 2019 | 9:00 AM – 10:00 AM PT – Authentication for Your Applications: Getting Started with Amazon Cognito – Learn how to use Amazon Cognito to add sign up and sign in to your web or mobile application. Serverless May 22, 2019 | 1:00 PM – 2:00 PM PT – Building Event-Driven Serverless Apps with AWS Event Fork Pipelines – Learn how to use Event Fork Pipelines, a new suite of open-source nested applications in the serverless application repository, to easily build event driven apps. Storage May 28, 2019 | 1:00 PM – 2:00 PM PT – Briefing: AWS Hybrid Cloud Storage and Edge Computing – Learn about AWS hybrid cloud storage and edge computing capabilities. May 23, 2019 | 11:00 AM – 12:00 PM PT – Managing Tens to Billions of Objects at Scale with S3 Batch Operations – Learn about S3 Batch Operations and how to manage billions of objects with a single API request in a few clicks.

SAP on AWS Update – Customer Case Studies, Scale-Up, Scale-Out, and More

SAP SAPPHIRE NOW 2019 takes place this week in Florida! Many of my AWS colleagues will be there, and they would love to talk to you. Today, I would like to share some customer success stories and give you a brief update on the work that we have been doing to make sure that AWS is the best place for you to run your SAP-powered OLTP and OLAP enterprise workloads. Customer Update Let’s start with a quick recap of some recent customer success stories. Here are just a few of the many customers that are using SAP on AWS in production today: Fiat Chrysler Automotive – After exploring multiple options and vendors, FIAT decided to deploy SAP on AWS with Capgemini as a partner: Engie – Read the case study to learn how this international energy provider has been able to Transform and Streamline their Financial Processes and drastically reduced the ramp-up time for new users from three days to one day by running SAP S/4HANA on AWS: AIG – Watch the video to learn how AIG migrated 13 SAP landscapes from an on-premises environment to SAP HANA on AWS in 13 months, while reducing their infrastructure cost by $8M: Sumitomo Chemical – Read this case study to learn how Sumitomo Chemical runs a huge number of SAP ERP batch jobs on AWS, cutting job processing time by around 40%: There are additional SAP on AWS Case Studies for your reading pleasure! AWS customers are making great use of the 6, 9, and 12 TB EC2 High Memory instances that we launched last year. They are building many different SAP Solutions on AWS, taking advantage of the SAP Rapid Migration Test Program, and working with members of the SAP Competency Partner Network. What’s New Our customers are building ever-larger SAP installations, using both scale-up (larger instances) or scale-out (more instances) models. We have been working with SAP to certify two additional scale-out options: 48 TB Scale-Out (S/4HANA) – When we launched the EC2 High Memory instances with 12 TB of memory last year, they were certified by SAP to run OLTP and OLAP HANA workloads in scale-up configurations. These instances now support additional configuration choices for your OLTP workloads. You can now use up to four of these 12 TB High Memory instances to run an OLTP S/4HANA solution in scale-out mode, while meeting all SAP requirements. This is the first ever SAP-certified scale-out certification of S/4HANA on cloud instances. SAP recommends (SAP OSS Note 2408419) the use of bare metal platforms with a minimum of 8 CPUs and 6 TB of memory for running S/4HANA in scale-out. Since the EC2 High Memory instances with 12 TB memory is an EC2 bare metal instance that combines the benefits of the cloud with the performance characteristics of a bare metal platform, it is able to support SAP-certified scale-out configurations for S/4HANA in the cloud. To learn more, read Announcing support for extremely large S/4HANA deployments on AWS and review the certification. 100 TB Scale-Out (BW4/HANA, BW on HANA, Datamart) – You can now use up to 25 x1e.32xlarge EC2 instances (thanks to TDI Phase 5) to create an OLAP solution that scales to 100 TB, again while meeting all SAP requirements. You can start with as little as 244 GB of memory and scale out to 100 TB; review the certification to learn more. The 48 TB OLTP solution and the 100 TB OLAP solution are the largest SAP-certified solutions available from any cloud provider. We also have a brand-new S/4HANA Quick Start to help you get going in minutes. 
It sets up a VPC that spans two Availability Zones, each with a public and private subnet, and a pair of EC2 instances. One instance hosts the primary copy of S/4HANA and the other hosts the secondary. Read the Quick Start to learn more: What’s Next Ok, still with me? I hope so, since I have saved the biggest news for last! We are getting ready to extend our lineup of EC2 High Memory instances, and will make them available with 18 TB and 24 TB of memory in the fall of 2019. The instances will use second-generation Intel® Xeon® Scalable processors, and will be available in bare metal form. Like the existing EC2 High Memory instances, you will be able to run them in the same Virtual Private Cloud (VPC) that hosts your cloud-based business applications, and you will be able to make use of important EC2 features such as Elastic Block Store and advanced networking. You can launch, manage, and even resize these EC2 instances using the AWS Command Line Interface (CLI) and the AWS SDKs. Here are screen shots of SAP HANA Studio running on 18 TB and 24 TB instances that are currently in development: And here is the output from top on those instances: Here is a handy reference to all of your scale-up and scale-out SAP HANA on AWS options: If you want to learn more or you want to gain early access to the new instances, go ahead and contact us. — Jeff;  
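Since the post notes that these EC2 High Memory instances can be launched, managed, and resized with the CLI and SDKs, here is a hedged boto3 sketch of a resize; the instance ID is a placeholder, the target instance type name is an assumption (actual High Memory type names may differ), and the instance must be stopped before its type can be changed:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    instance_id = "i-0123456789abcdef0"  # placeholder

    # Stop the instance, wait for it to stop, change its type, then start it again.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": "u-18tb1.metal"},  # hypothetical target size
    )

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])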

Building Serverless Pipelines with Amazon CloudWatch Events

Guest post by AWS Serverless Hero Forrest Brazeal. Forrest is a senior cloud architect at Trek10, Inc., host of the Think FaaS serverless podcast at Trek10, and a regular speaker at workshops and events in the serverless community. Events and serverless go together like baked beans and barbecue. The serverless mindset says to focus on code and configuration that provide business value. It turns out that much of the time, this means working with events: structured data corresponding to things that happen in the outside world. Rather than maintaining long-running server tasks that chew up resources while polling, I can create serverless applications that do work only in response to event triggers. I have lots of options when working with events in AWS: Amazon Kinesis Data Streams, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), and more, depending on my requirements. Lately, I’ve been using a service more often that has the word ‘event’ right in the name: Amazon CloudWatch Events.

CloudWatch Events: The best-kept secret in serverless event processing
I first knew CloudWatch as the service that collects my Lambda logs and lets me run functions on a schedule. But CloudWatch Events also lets me publish my own custom events using the CloudWatch API. It has similar pricing and delivery guarantees to SNS, and supports a whole bunch of AWS services as targets. Best of all, I don’t even have to provision the event bus—it’s just there in the CloudWatch console. I can publish an event now, using the boto3 AWS SDK for Python:

    import boto3

    # Custom events are published through the CloudWatch Events API,
    # which boto3 exposes as the 'events' client.
    events = boto3.client('events')
    events.put_events(
        Entries=[
            {
                'Source': 'my.app.event',
                'DetailType': 'MY_EVENT_TYPE',
                'Detail': '{"my_data":"As a JSON string"}'
            }
        ]
    )

In short, CloudWatch Events gives me a fully managed event pipe that supports an arbitrary number of consumers, where I can drop any kind of JSON string that I want. And this is super useful for building serverless apps.

Event-driven architectures in action
I build cloud-native solutions for clients at Trek10 daily. I frequently use event-driven architectures as a powerful way to migrate legacy systems to serverless, enable easier downstream integrations, and more. Here are just a couple of my favorite patterns:
• Strangling legacy databases
• Designing event-sourced applications

Strangling legacy databases
The “strangler pattern” hides a legacy system behind a wrapper API, while gradually migrating users to the new interface. Trek10 has written about this before. Streaming changes to the cloud as events is a great way to open up reporting and analytics use cases while taking load off a legacy database. The following diagram shows writing a legacy database to events. This pattern can also work the other way: I can write new data to CloudWatch Events, consume it into a modern data source, and create a second consumer that syncs the data back to my legacy system.

Designing event-sourced applications
Event sourcing simply means treating changes in the system state as events, publishing them on a ledger or bus where they can be consumed by different downstream applications. Using CloudWatch Events as a centralized bus, I can make a sanitized record of events available as shown in the following event validation flow diagram. The validation function ensures that only events that match my application’s standards get tagged as “valid” and are made available to downstream consumers.
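Wiring a consumer to events like the one published above takes nothing more than a rule and a target. Here is a hedged sketch using boto3; the Lambda function ARN is a placeholder, and the rule and statement names are arbitrary:

    import json
    import boto3

    events = boto3.client("events")
    lambda_client = boto3.client("lambda")

    # Placeholder ARN for the consumer function.
    FUNCTION_ARN = "arn:aws:lambda:us-east-1:123456789012:function:my-event-consumer"

    # Match only the custom events published with Source 'my.app.event'.
    events.put_rule(
        Name="my-app-event-rule",
        EventPattern=json.dumps({"source": ["my.app.event"]}),
        State="ENABLED",
    )

    # Send matching events to the Lambda function.
    events.put_targets(
        Rule="my-app-event-rule",
        Targets=[{"Id": "my-consumer", "Arn": FUNCTION_ARN}],
    )

    # Allow CloudWatch Events to invoke the function.
    lambda_client.add_permission(
        FunctionName=FUNCTION_ARN,
        StatementId="allow-cloudwatch-events",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
    )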
The default bus handles lots of events (remember, service logs go here!), so it’s important to set up rules that only match the events that I care about. CloudWatch Events simplifies these patterns by providing a single bus where I can apply filters and subscribe consumers, all without having to provision or maintain any infrastructure. And that’s just the beginning.

Use case: Multi-account event processing with CloudWatch Events
CloudWatch Events gets most interesting when I start connecting multiple AWS accounts. It’s easy to set up a trust relationship between CloudWatch Event buses in different accounts, using filtering rules to choose which events get forwarded. As an example, imagine a widget processing system for a large enterprise, AnyCompany. AnyCompany has many different development teams, each using their own AWS account. Some services are producing information about widgets as they check into warehouses or travel across the country. Others need that data to run reports or innovate new products. Suppose that Service A produces information about new widgets, Service B wants to view aggregates about widgets in real time, and Service C needs historical data about widgets for reporting. The full event flow might look like the following diagram.

1. Service A publishes the new widget event to CloudWatch Events in their AWS account with the following event body:

    {
        'Source': 'cwi.servicea',
        'DetailType': 'NEW_WIDGET',
        'Detail': '{"widget_id":"abc123"}'
    }

2. A filtering rule forwards events tagged with cwi.servicea to the central event processing account. Using CloudFormation, they could define the rule as follows:

    CentralForwardingRule:
      Type: AWS::Events::Rule
      Properties:
        Description: Rule for sending events to central account
        EventPattern:
          source:
            - cwi.servicea
        Targets:
          - Arn: !Sub arn:aws:events:${CENTRAL_ACCOUNT_REGION}:${CENTRAL_ACCOUNT_ID}:event-bus/default
            Id: CentralTarget
            RoleArn: <IAM ROLE WITH ACCESS TO PUT CW EVENTS>

3. The event is validated according to their standards.
4. The valid event is republished on the central event bus with a new source, valid.cw.servicea. This is important because, to avoid infinite loops, an individual event can only be forwarded one time.
5. A filtering rule forwards the valid event to Service B’s AWS account, where it updates a DynamoDB table connected to an AWS AppSync API.
6. A second rule forwards the same event to the Service C account, where it flows through Kinesis Data Firehose to an Amazon S3 bucket for analysis using Amazon Athena.

What CloudWatch Events provides here is a decoupled system that uses mostly “plug-and-play” services, and yet opens up flexibility for future innovation.

Taking full advantage of the cloud
The biggest reason I love CloudWatch Events? It’s a fantastically cloud-native service. There’s little code required, and no operational responsibilities beyond watching AWS service limits. I don’t even have to deploy anything to start using it. And yet, under the hood, I’m using a powerful piece of AWS infrastructure that can support complex distributed applications without breaking a sweat. That’s pretty close to the platonic ideal of serverless apps. Anytime I’m cooking up an event-driven application, I make sure to consider what CloudWatch Events can bring to the table.
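Here is a hedged sketch of what the validation function in steps 3 and 4 above might look like; the republished source valid.cw.servicea and the widget_id field come from the example flow, while the validation rule itself is an assumption made for illustration:

    import json
    import boto3

    events = boto3.client("events")

    REQUIRED_FIELDS = {"widget_id"}  # assumed validation standard for this example

    def lambda_handler(event, context):
        """Validate an incoming widget event and republish it on the central bus."""
        detail = event.get("detail", {})

        # Reject events that do not meet the application's standards.
        if not REQUIRED_FIELDS.issubset(detail):
            print("Dropping invalid event: " + json.dumps(event))
            return {"status": "rejected"}

        # Republish with a new source so it can be forwarded again
        # (an individual event can only be forwarded one time).
        events.put_events(
            Entries=[{
                "Source": "valid.cw.servicea",
                "DetailType": event.get("detail-type", "NEW_WIDGET"),
                "Detail": json.dumps(detail),
            }]
        )
        return {"status": "valid"}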

New – Amazon S3 Batch Operations

AWS customers routinely store millions or billions of objects in individual Amazon Simple Storage Service (S3) buckets, taking advantage of S3’s scale, durability, low cost, security, and storage options. These customers store images, videos, log files, backups, and other mission-critical data, and use S3 as a crucial part of their data storage strategy. Batch Operations Today, I would like to tell you about Amazon S3 Batch Operations. You can use this new feature to easily process hundreds, millions, or billions of S3 objects in a simple and straightforward fashion. You can copy objects to another bucket, set tags or access control lists (ACLs), initiate a restore from Glacier, or invoke an AWS Lambda function on each one. This feature builds on S3’s existing support for inventory reports (read my S3 Storage Management Update post to learn more), and can use the reports or CSV files to drive your batch operations. You don’t have to write code, set up any server fleets, or figure out how to partition the work and distribute it to the fleet. Instead, you create a job in minutes with a couple of clicks, turn it loose, and sit back while S3 uses massive, behind-the-scenes parallelism to take care of the work. You can create, monitor, and manage your batch jobs using the S3 Console, the S3 CLI, or the S3 APIs. A Quick Vocabulary Lesson Before we get started and create a batch job, let’s review and introduce a couple of important terms: Bucket – An S3 bucket holds a collection of any number of S3 objects, with optional per-object versioning. Inventory Report – An S3 inventory report is generated each time a daily or weekly bucket inventory is run. A report can be configured to include all of the objects in a bucket, or to focus on a prefix-delimited subset. Manifest – A list (either an Inventory Report, or a file in CSV format) that identifies the objects to be processed in the batch job. Batch Action – The desired action on the objects described by a Manifest. Applying an action to an object constitutes an S3 Batch Task. IAM Role – An IAM role that provides S3 with permission to read the objects in the inventory report, perform the desired actions, and to write the optional completion report. If you choose Invoke AWS Lambda function as your action, the function’s execution role must grant permission to access the desired AWS services and resources. Batch Job – References all of the items above. Each job has a status and a priority; higher priority (numerically) jobs take precedence over those with lower priority. Running a Batch Job Ok, let’s use the S3 Console to create and run a batch job! In preparation for this blog post I enabled inventory reports for one of my S3 buckets (jbarr-batch-camera) earlier this week, with the reports routed to jbarr-batch-inventory: I select the desired inventory item, and click Create job from manifest to get started (I can also click Batch operations while browsing my list of buckets). All of the relevant information is already filled in, but I can choose an earlier version of the manifest if I want (this option is only applicable if the manifest is stored in a bucket that has versioning enabled). I click Next to proceed: I choose my operation (Replace all tags), enter the options that are specific to it (I’ll review the other operations later), and click Next: I enter a name for my job, set its priority, and request a completion report that encompasses all tasks. 
Then I choose a bucket for the report and select an IAM Role that grants the necessary permissions (the console also displays a role policy and a trust policy that I can copy and use), and click Next: Finally, I review my job, and click Create job: The job enters the Preparing state. S3 Batch Operations checks the manifest and does some other verification, and the job enters the Awaiting your confirmation state (this only happens when I use the console). I select it and click Confirm and run: I review the confirmation (not shown) to make sure that I understand the action to be performed, and click Run job. The job enters the Ready state, and starts to run shortly thereafter. When it is done it enters the Complete state: If I was running a job that processed a substantially larger number of objects, I could refresh this page to monitor status. One important thing to know: After the first 1000 objects have been processed, S3 Batch Operations examines and monitors the overall failure rate, and will stop the job if the rate exceeds 50%. The completion report contains one line for each of my objects, and looks like this:

Other Built-In Batch Operations
I don’t have enough space to give you a full run-through of the other built-in batch operations. Here’s an overview: The PUT copy operation copies my objects, with control of the storage class, encryption, access control list, tags, and metadata: I can copy objects to the same bucket to change their encryption status. I can also copy them to another region, or to a bucket owned by another AWS account. The Replace Access Control List (ACL) operation does exactly that, with control over the permissions that are granted: And the Restore operation initiates an object-level restore from the Glacier or Glacier Deep Archive storage class:

Invoking AWS Lambda Functions
I have saved the most general option for last. I can invoke a Lambda function for each object, and that Lambda function can programmatically analyze and manipulate each object. The Execution Role for the function must trust S3 Batch Operations: Also, the Role for the Batch job must allow Lambda functions to be invoked. With the necessary roles in place, I can create a simple function that calls Amazon Rekognition for each image:

    import boto3

    def lambda_handler(event, context):
        s3Client = boto3.client('s3')
        rekClient = boto3.client('rekognition')

        # Parse job parameters
        jobId = event['job']['id']
        invocationId = event['invocationId']
        invocationSchemaVersion = event['invocationSchemaVersion']

        # Process the task
        task = event['tasks'][0]
        taskId = task['taskId']
        s3Key = task['s3Key']
        s3VersionId = task['s3VersionId']
        s3BucketArn = task['s3BucketArn']
        s3Bucket = s3BucketArn.split(':')[-1]
        print('BatchProcessObject(' + s3Bucket + "/" + s3Key + ')')

        resp = rekClient.detect_labels(
            Image={'S3Object': {'Bucket': s3Bucket, 'Name': s3Key}},
            MaxLabels=10,
            MinConfidence=85)

        l = [lb['Name'] for lb in resp['Labels']]
        print(s3Key + ' - Detected:' + str(sorted(l)))

        results = [{
            'taskId': taskId,
            'resultCode': 'Succeeded',
            'resultString': 'Succeeded'
        }]

        return {
            'invocationSchemaVersion': invocationSchemaVersion,
            'treatMissingKeysAs': 'PermanentFailure',
            'invocationId': invocationId,
            'results': results
        }

With my function in place, I select Invoke AWS lambda function as my operation when I create my job, and choose my BatchProcessObject function: Then I create and confirm my job as usual.
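Before confirming a Lambda-backed job, the handler can be exercised locally with a synthetic event. Here is a hedged sketch, assuming it is appended to the same file as the handler above; all identifiers are made up, the field names simply mirror the ones the handler reads, and a local run still needs AWS credentials because the handler calls Amazon Rekognition:

    # A synthetic S3 Batch Operations invocation for a quick local test of
    # the lambda_handler defined above. All identifiers are made up.
    sample_event = {
        "invocationSchemaVersion": "1.0",
        "invocationId": "example-invocation-id",
        "job": {"id": "example-job-id"},
        "tasks": [{
            "taskId": "example-task-id",
            "s3Key": "images/example.jpg",
            "s3VersionId": None,
            "s3BucketArn": "arn:aws:s3:::jbarr-batch-camera",
        }],
    }

    if __name__ == "__main__":
        print(lambda_handler(sample_event, None))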
The function will be invoked for each object, taking advantage of Lambda’s ability to scale and allowing this moderately-sized job to run to completion in less than a minute: I can find the “Detected” messages in the CloudWatch Logs Console: As you can see from my very simple example, the ability to easily run Lambda functions on large numbers of S3 objects opens the door to all sorts of interesting applications. Things to Know I am looking forward to seeing and hearing about the use cases that you discover for S3 Batch Operations! Before I wrap up, here are some final thoughts: Job Cloning – You can clone an existing job, fine-tune the parameters, and resubmit it as a fresh job. You can use this to re-run a failed job or to make any necessary adjustments. Programmatic Job Creation – You could attach a Lambda function to the bucket where you generate your inventory reports and create a fresh batch job each time a report arrives. Jobs that are created programmatically do not need to be confirmed, and are immediately ready to execute. CSV Object Lists – If you need to process a subset of the objects in a bucket and cannot use a common prefix to identify them, you can create a CSV file and use it to drive your job. You could start from an inventory report and filter the objects based on name or by checking them against a database or other reference. For example, perhaps you use Amazon Comprehend to perform sentiment analysis on all of your stored documents. You can process inventory reports to find documents that have not yet been analyzed and add them to a CSV file. Job Priorities – You can have multiple jobs active at once in each AWS region. Your jobs with a higher priority take precedence, and can cause existing jobs to be paused momentarily. You can select an active job and click Update priority in order to make changes on the fly: Learn More Here are some resources to help you learn more about S3 Batch Operations: Documentation – Read about Creating a Job, Batch Operations, and Managing Batch Operations Jobs. Tutorial Videos – Check out the S3 Batch Operations Video Tutorials to learn how to Create a Job, Manage and Track a Job, and to Grant Permissions. Now Available You can start using S3 Batch Operations in all commercial AWS regions except Asia Pacific (Osaka) today. S3 Batch Operations is also available in both of the AWS GovCloud (US) regions. — Jeff;
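As a companion to the Programmatic Job Creation note above, here is a hedged sketch of creating a job through the S3 Control API with boto3; the account ID, role ARN, manifest location, and ETag are placeholders, and the Operation block shown (tag replacement) is just one of the available actions:

    import uuid
    import boto3

    s3control = boto3.client("s3control", region_name="us-east-1")

    ACCOUNT_ID = "123456789012"  # placeholder

    response = s3control.create_job(
        AccountId=ACCOUNT_ID,
        ConfirmationRequired=False,            # programmatic jobs can skip confirmation
        ClientRequestToken=str(uuid.uuid4()),  # idempotency token
        Priority=10,
        RoleArn="arn:aws:iam::123456789012:role/batch-operations-role",  # placeholder
        Operation={
            "S3PutObjectTagging": {
                "TagSet": [{"Key": "processed", "Value": "true"}]
            }
        },
        Manifest={
            "Spec": {
                "Format": "S3BatchOperations_CSV_20180820",
                "Fields": ["Bucket", "Key"],
            },
            "Location": {
                "ObjectArn": "arn:aws:s3:::jbarr-batch-inventory/manifest.csv",  # placeholder
                "ETag": "example-etag",  # placeholder
            },
        },
        Report={
            "Enabled": True,
            "Bucket": "arn:aws:s3:::jbarr-batch-inventory",  # placeholder
            "Format": "Report_CSV_20180820",
            "Prefix": "completion-reports",
            "ReportScope": "AllTasks",
        },
    )
    print(response["JobId"])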

Use AWS Transit Gateway & Direct Connect to Centralize and Streamline Your Network Connectivity

Last year I showed you how to Use an AWS Transit Gateway to Simplify Your Network Architecture. As I said at the time: You can connect your existing VPCs, data centers, remote offices, and remote gateways to a managed Transit Gateway, with full control over network routing and security, even if your VPCs, Active Directories, shared services, and other resources span multiple AWS accounts. You can simplify your overall network architecture, reduce operational overhead, and gain the ability to centrally manage crucial aspects of your external connectivity, including security. Last but not least, you can use Transit Gateways to consolidate your existing edge connectivity and route it through a single ingress/egress point. In that post I also promised you support for AWS Direct Connect, and I’m happy to announce that this support is available today for use in the US East (N. Virginia), US East (Ohio), US West (N. California), and US West (Oregon) Regions. The applications that you run in the AWS Cloud can now communicate with each other, and with your on-premises applications, at speeds of up to 10 Gbps per Direct Connect connection. You can set it up in minutes (assuming that you already have a dedicated or hosted connection running at 1 Gbps or more) and start using it right away. Putting it all together, you get a lot of important benefits from today’s launch: Simplification – You can simplify your network architecture and your network management overhead by creating a hub-and-spoke model that spans multiple VPCs, regions, and AWS accounts. If you go this route, you may also be in a position to cut down on the number of AWS VPN connections that you use. Consolidation – You have the opportunity to reduce the number of dedicated or hosted connections, saving money and avoiding complexity in the process. You can consolidate your connectivity so that it all flows across the same BGP session. Connectivity – You can reach your Transit Gateway using your connections from any of the 90+ AWS Direct Connect locations (except from AWS Direct Connect locations in China). Using Transit Gateway & Direct Connect I will use the freshly updated Direct Connect Console to set up my Transit Gateway for use with Direct Connect. The menu on the left lets me view and create the resources that I will need: My AWS account already has access to a 1 Gbps connection (MyConnection) to TierPoint in Seattle: I create a Direct Connect Gateway (MyDCGateway): I create a Virtual Interface (VIF) with type Transit: I reference my Direct Connect connection (MyConnection) and my Direct Connect Gateway (MyDCGateway) and click Create virtual interface: When the state of my new VIF switches from pending to down I am ready to proceed: Now I am ready to create my transit gateway (MyTransitGW). This is a VPC component; clicking on Transit gateways takes me to the VPC console. I enter a name, description, and ASN (which must be distinct from the one that I used for the Direct Connect Gateway), leave the other values as-is, and click Create Transit Gateway: The state starts out as pending, and transitions to available: With all of the resources ready, I am ready to connect them! 
I return to the Direct Connect Console, find my Transit Gateway, and click Associate Direct Connect gateway: I associate the Transit Gateway with a Direct Connect Gateway in my account (using another account requires the ID of the gateway and the corresponding AWS account number), and list the network prefixes that I want to advertise to the other side of the Direct Connect connection. Then I click Associate Direct Connect gateway to make it so: The state starts out as associating and transitions to associated. This can take some time, so I will take Luna for a walk: By the time we return, the Direct Connect Gateway is associated with the Transit Gateway, and we are good to go! In a real-world situation you would spend more time planning your network topology and addressing, and you would probably use multiple AWS accounts. Available Now You can use this new feature today to interface with your Transit Gateways hosted in four AWS regions. — Jeff;
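For readers who prefer to script these steps, here is a hedged boto3 sketch of the same flow; the connection ID, ASNs, VLAN, Region, and advertised prefixes are placeholders, and your values will differ:

    import boto3

    dx = boto3.client("directconnect", region_name="us-west-2")
    ec2 = boto3.client("ec2", region_name="us-west-2")

    # 1. Create a Direct Connect gateway (example private ASN).
    dx_gw = dx.create_direct_connect_gateway(
        directConnectGatewayName="MyDCGateway",
        amazonSideAsn=64512,
    )
    dx_gw_id = dx_gw["directConnectGateway"]["directConnectGatewayId"]

    # 2. Create a transit virtual interface on an existing 1 Gbps+ connection.
    dx.create_transit_virtual_interface(
        connectionId="dxcon-example123",  # placeholder connection ID
        newTransitVirtualInterface={
            "virtualInterfaceName": "MyTransitVIF",
            "vlan": 101,
            "asn": 65000,
            "directConnectGatewayId": dx_gw_id,
        },
    )

    # 3. Create a transit gateway with a distinct Amazon-side ASN.
    tgw = ec2.create_transit_gateway(
        Description="MyTransitGW",
        Options={"AmazonSideAsn": 64513},
    )
    tgw_id = tgw["TransitGateway"]["TransitGatewayId"]

    # 4. Associate the two, advertising prefixes to the on-premises side.
    dx.create_direct_connect_gateway_association(
        directConnectGatewayId=dx_gw_id,
        gatewayId=tgw_id,
        addAllowedPrefixesToDirectConnectGateway=[{"cidr": "10.0.0.0/16"}],
    )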

New – Amazon Managed Blockchain – Create & Manage Scalable Blockchain Networks

Trust is a wonderful thing, and is the basis for almost every business and personal relationship or transaction. In some cases, trust is built up over an extended period of time, reinforced with each successful transaction and seen as an integral part of the relationship. In other situations, there’s no time to accumulate trust and other mechanisms must be used instead. The parties must find a way to successfully complete the transaction in the absence of trust. Today, emerging blockchain technologies such as Hyperledger Fabric and Ethereum fill this important need, allowing parties to come to consensus regarding the validity of a proposed transaction and create an unalterable digital record (commonly known as a ledger) of each transaction in the absence of trust. Amazon Managed Blockchain We announced Amazon Managed Blockchain at AWS re:Invent 2018 and invited you to sign up for a preview. I am happy to announce that the preview is complete and that Amazon Managed Blockchain is now available for production use in the US East (N. Virginia) Region. You can use it to create scalable blockchain networks that use the Hyperledger Fabric open source framework, with Ethereum in the works. As you will see in a minute, you can create your network in minutes. Once created, you can easily manage and maintain your blockchain network. You can manage certificates, invite new members, and scale out peer node capacity in order to process transactions more quickly. The blockchain networks that you create with Amazon Managed Blockchain can span multiple AWS accounts so that a group of members can execute transactions and share data without a central authority. New members can easily launch and configure peer nodes that process transaction requests and store a copy of the ledger. Using Amazon Managed Blockchain I can create my own scalable blockchain network from the AWS Management Console, AWS Command Line Interface (CLI) (aws managedblockchain create-network), or API (CreateNetwork). To get started, I open the Amazon Managed Blockchain Console and click Create a network: I need to choose the edition (Starter or Standard) for my network. The Starter Edition is designed for test networks and small production networks, with a maximum of 5 members per network and 2 peer nodes per member. The Standard Edition is designed for scalable production use, with up to 14 members per network and 3 peer nodes per member (check out the Amazon Managed Blockchain Pricing to learn more about both editions). I also enter a name and a description for my network: Then I establish the voting policy for my network, and click Next to move ahead (read Work with Proposals to learn more about creating and voting on proposals): Now, I need to create the first member of my network. Each member is a distinct identity within the network, and is visible within the network. I also set up a user name and password for my certificate authority, and click Next: I review my choices, and click Create network and member: My network enters the Creating status, and I take a quick break to walk my dog! When I return, my network is Available: Inviting Members Now that my network is available, I can invite members by clicking the Members tab: I can see the current members of my network, both those I own and those owned by others. I click on Propose invitation to invite a new member: Then I enter the AWS account number of the proposed member and click Create: This creates a proposal (visible to me and to the other members of the network). 
I click on the ID to proceed: I review the proposal, select my identity (block-wizard), and then click Yes to vote: After enough Yes votes have been received to pass the threshold that I specified when I created the network, the invitation will be extended to the new member, and will be visible in the Invitations section: If you are building a blockchain network for testing purposes and don’t have access to multiple AWS accounts, you can even invite your own account. After you do this (and vote to let yourself in), you will end up with multiple members in the same account. Using the Network Now that the network is running, and has some members, the next step is to create an endpoint in the Virtual Private Cloud (VPC) where I will run my blockchain applications (this feature is powered by AWS PrivateLink). Starting from the detail page for my network, I click Create VPC endpoint: I choose the desired VPC and the subnets within it, pick a security group, and click Create: My applications can use the VPC endpoint to communicate with my blockchain network: The next step is to build applications that make use of the blockchain. To learn how to do this, read Build and deploy an application for Hyperledger Fabric on Amazon Managed Blockchain. You can also read Get Started Creating a Hyperledger Fabric Blockchain Network Using Amazon Managed Blockchain. Things to Know As usual, we have a healthy roadmap for this new service. Stay tuned to learn more! — Jeff; PS – Check out the AWS Blockchain Pub to see a novel use for Amazon Managed Blockchain and AWS DeepLens.  
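As a footnote to the walkthrough above, here is a minimal boto3 sketch of the CreateNetwork call for readers who prefer to script the first step rather than click through the console. The names, credentials, voting thresholds, and framework version shown here are illustrative assumptions; check the Amazon Managed Blockchain documentation for the currently supported values.

import boto3
import uuid

client = boto3.client('managedblockchain', region_name='us-east-1')

# Create a Starter Edition Hyperledger Fabric network with a single founding member.
# All names, credentials, and version numbers below are placeholders.
response = client.create_network(
    ClientRequestToken=str(uuid.uuid4()),
    Name='my-test-network',
    Description='Demo network',
    Framework='HYPERLEDGER_FABRIC',
    FrameworkVersion='1.2',  # assumed version; confirm against the current docs
    FrameworkConfiguration={'Fabric': {'Edition': 'STARTER'}},
    VotingPolicy={
        'ApprovalThresholdPolicy': {
            'ThresholdPercentage': 50,
            'ProposalDurationInHours': 24,
            'ThresholdComparator': 'GREATER_THAN',
        }
    },
    MemberConfiguration={
        'Name': 'founding-member',
        'Description': 'First member of the network',
        'FrameworkConfiguration': {
            'Fabric': {'AdminUsername': 'admin', 'AdminPassword': 'Password123!'}
        },
    },
)
print(response['NetworkId'], response['MemberId'])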

The AWS DeepRacer League Virtual Circuit is Now Open – Train Your Model Today!

AWS DeepRacer is a 1/18th scale four-wheel drive car with a considerable amount of onboard hardware and software. Starting at re:Invent 2018 and continuing with the AWS Global Summits, you have the opportunity to get hands-on experience with a DeepRacer. At these events, you can train a model using reinforcement learning, and then race it around a track. The fastest racers and their laptimes for each summit are shown on our leaderboards. New DeepRacer League Virtual Circuit Today we are launching the AWS DeepRacer League Virtual Circuit. You can build, train, and evaluate your reinforcement learning models online and compete online for some amazing prizes, all from the comfort of the DeepRacer Console! We’ll add a new track each month, taking inspiration from famous race tracks around the globe, so that you can refine your models and broaden your skill set. The top entrant in the leaderboard each month will win an expenses-paid package to AWS re:Invent 2019, where they will take part in the League Knockout Rounds, with a chance to win the Championship Cup! New DeepRacer Console We are making the DeepRacer Console available today in the US East (N. Virginia) Region. You can use it to build and train your DeepRacer models and to compete in the Virtual Circuit, while gaining practical, hands-on experience with Reinforcement Learning. Following the steps in the DeepRacer Lab that is used at the hands-on DeepRacer workshops, I open the console and click Get started: The console provides me with an overview of the model training process, and then asks me to create the AWS resources needed to train and evaluate my models. I review the info and click Create resources to proceed: The resources are created in minutes (I can click Learn RL to learn more about reinforcement learning while this is happening). I click Create model to move ahead: I enter a name and a description for my model: Then I pick a track (more tracks will appear throughout the duration of the Virtual League): Now I define the Action space for my model. This is a set of discrete actions that my model can perform. Choosing values that increase the number of options will generally enhance the quality of my model, at the cost of additional training time: Next, I define the reward function for my model. This function evaluates the current state of the vehicle throughout the training process and returns a reward value to indicate how well the model is performing (higher rewards signify better performance). I can use one of three predefined examples (available by clicking Reward function examples) as-is, customize them, or build one from scratch. I’ll use Prevent zig-zag, a sample reward function that penalizes zig-zag behavior, to get started: The reward function is written in Python 3, and has access to parameters (track_width, distance_from_center, all_wheels_on_track, and many more) that describe the position and state of the car, and also provide information about the track. I also control a set of hyperparameters that affect the overall training performance. Since I don’t understand any of these (just being honest here), I will accept all of the defaults: To learn more about hyperparameters, read Systematically Tune Hyperparameters. Finally, I specify a time limit for my training job, and click Start training. In general, simple models will converge in 90 to 120 minutes, but this is highly dependent on the maximum speed and the reward function.
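To make the reward function a bit more concrete, here is a minimal sketch (not the Prevent zig-zag sample itself) that rewards the car for staying close to the center line. It uses only two of the documented parameters, track_width and distance_from_center:

def reward_function(params):
    # params is a dict supplied by the DeepRacer training environment
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']

    # Reward the car for staying near the center of the track,
    # using three bands of decreasing reward.
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # probably heading off track

    return float(reward)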
The training job is initialized (this takes about 6 minutes), and I can track progress in the console: The training job makes use of AWS RoboMaker so I can also monitor it from the RoboMaker Console. For example, I can open the Gazebo window, see my car, and watch the training process in real time: One note of caution: changing the training environment (by directly manipulating Gazebo) will adversely affect the training run, most likely rendering it useless. As the training progresses, the Reward graph will go up and to the right (as we often say at Amazon) if the car is learning how to stay on the track: If the graph flattens out or declines precipitously and stays there, your reward function is not rewarding the desired behavior or some other setting is getting in the way. However, patience is a virtue, and there will be the occasional regression on the way to the top. After the training is complete, there’s a short pause while the new model is finalized and stored, and then it is time for me to evaluate my model by running it in a simulation. I click Start evaluation to do this: I can evaluate the model on any of the available tracks. Using one track for training and a different one for evaluation is a good way to make sure that the model is general, and has not been overfit so that it works on just one track. However, using the same track for training and testing is a good way to get started, and that’s what I will do. I select the Oval Track and 3 trials, and click Start evaluation: The RoboMaker simulator launches, with an hourly cost for the evaluation. The results (lap times) are displayed when the simulation is complete: At this point I can evaluate my model on another track, step back and refine my model and evaluate it again, or submit my model to the current month’s Virtual Circuit track to take part in the DeepRacer League. To do that, I click Submit to virtual race, enter my racer name, choose a model, agree to the Ts and C’s, and click Submit model: After I submit, my model will be evaluated on the pre-season track and my lap time will be used to place me in the Virtual Circuit Leaderboard. Things to Know Here are a couple of things to know about the AWS DeepRacer and the AWS DeepRacer League: AWS Resources – Amazon SageMaker is used to train models, which are then stored in Amazon Simple Storage Service (S3). AWS RoboMaker provides the virtual track environment, which is used for training and evaluation. An AWS CloudFormation stack is used to create an Amazon Virtual Private Cloud, complete with subnets, routing tables, an Elastic IP Address, and a NAT Gateway. Costs – You can use the DeepRacer console at no charge. As soon as you start training your first model, you will get service credits for SageMaker and RoboMaker to give you 10 hours of free training on these services. The credits are applied at the end of the month and are available for 30 days, as part of the AWS Free Tier. The DeepRacer architecture uses a NAT Gateway that carries an availability charge. Your account will automatically receive service credits to offset this charge, showing net zero on your account. DeepRacer Cars – You can preorder your DeepRacer car now! Deliveries to addresses in the United States will begin in July 2019. — Jeff;

Now Available – Elastic Fabric Adapter (EFA) for Tightly-Coupled HPC Workloads

We announced Elastic Fabric Adapter (EFA) at re:Invent 2018 and made it available in preview form at the time. During the preview, AWS customers put EFA through its paces on a variety of tightly-coupled HPC workloads, providing us with valuable feedback and helping us to fine-tune the final product. Now Available Today I am happy to announce that EFA is now ready for production use in multiple AWS regions. It is ready to support demanding HPC workloads that need lower and more consistent network latency, along with higher throughput, than is possible with traditional TCP communication. This launch lets you apply the scale, flexibility, and elasticity of the AWS Cloud to tightly-coupled HPC apps and I can’t wait to hear what you do with it. You can, for example, scale up to thousands of compute nodes without having to reserve the hardware or the network ahead of time. All About EFA An Elastic Fabric Adapter is an AWS Elastic Network Adapter (ENA) with added capabilities (read my post, Elastic Network Adapter – High Performance Network Interface for Amazon EC2, to learn more about ENA). An EFA can still handle IP traffic, but also supports an important access model commonly called OS bypass. This model allows the application (most commonly through some user-space middleware) to access the network interface without having to get the operating system involved with each message. Doing so reduces overhead and allows the application to run more efficiently. Here’s what this looks like (source): The MPI Implementation and libfabric layers of this cake play crucial roles: MPI – Short for Message Passing Interface, MPI is a long-established communication protocol that is designed to support parallel programming. It provides functions that allow processes running on a tightly-coupled set of computers to communicate in a language-independent way. libfabric – This library fits in between several different types of network fabric providers (including EFA) and higher-level libraries such as MPI. EFA supports the standard RDM (reliable datagram) and DGRM (unreliable datagram) endpoint types; to learn more, check out the libfabric Programmer’s Manual. EFA also supports a new protocol that we call Scalable Reliable Datagram; this protocol was designed to work within the AWS network and is implemented as part of our Nitro chip. Working together, these two layers (and others that can be slotted in instead of MPI) allow you to bring your existing HPC code to AWS and run it with little or no change. You can use EFA today on c5n.18xlarge and p3dn.24xlarge instances in all AWS regions where those instances are available. The instances can use EFA to communicate within a VPC subnet, and the security group must have ingress and egress rules that allow all traffic within the security group to flow. Each instance can have a single EFA, which can be attached when an instance is started or while it is stopped. You will also need the following software components: EFA Kernel Module – The EFA Driver is in the Amazon GitHub repo; read Getting Started with EFA to learn how to create an EFA-enabled AMI for Amazon Linux, Amazon Linux 2, and other popular Linux distributions. Libfabric Network Stack – You will need to use an AWS-custom version for now; again, the Getting Started document contains installation information. We are working to get our changes into the next release (1.8) of libfabric. MPI or NCCL Implementation – You can use Open MPI 3.1.3 (or later) or NCCL (2.3.8 or later) plus the OFI driver for NCCL.
We are also working on support for the Intel MPI library. You can launch an instance and attach an EFA using the CLI, API, or the EC2 Console, with CloudFormation support coming in a couple of weeks. If you are using the CLI, you need to include the subnet ID and ask for an EFA, like this (be sure to include the appropriate security group):

$ aws ec2 run-instances ... \
    --network-interfaces DeleteOnTermination=true,DeviceIndex=0,SubnetId=SUBNET,InterfaceType=efa

After your instance has launched, run lspci | grep efa0 to verify that the EFA device is attached. You can (but don’t have to) launch your instances in a Cluster Placement Group in order to benefit from physical adjacency when every microsecond counts. When used in this way, EFA can provide one-way MPI latency of 15.5 microseconds. You can also create a Launch Template and use it to launch EC2 instances (either directly or as part of an EC2 Auto Scaling Group) in On-Demand or Spot form, to launch Spot Fleets, and to run compute jobs on AWS Batch. Learn More To learn more about EFA, and to see some additional benchmarks, be sure to watch this re:Invent video (Scaling HPC Applications on EC2 w/ Elastic Fabric Adapter): AWS Customer CFD Direct maintains the popular OpenFOAM platform for Computational Fluid Dynamics (CFD) and also produces CFD Direct From the Cloud (CFDDC), an AWS Marketplace offering that makes it easy for you to run OpenFOAM on AWS. They have been testing and benchmarking EFA and recently shared their measurements in a blog post titled OpenFOAM HPC with AWS EFA. In the post, they report on a pair of simulations: External Aerodynamics Around a Car – This simulation scales extra-linearly to over 200 cores, gradually declining to linear scaling at 1000 cores (about 100K simulation cells per core). Flow Over a Weir with Hydraulic Jump – This simulation (1000 cores and 100M cells) scales at between 67% and 72.6%, depending on a “data write” setting. Read the full post to learn more and to see some graphs and visualizations. In the Works We plan to add EFA support to additional EC2 instance types over time. In general, we plan to provide EFA support for the two largest sizes of “n” instances of any given type, and also for bare metal instances. — Jeff;
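If you script your launches in Python rather than the CLI, here is a minimal boto3 sketch of the equivalent request. The AMI, subnet, and security group IDs are placeholders that you would replace with your own EFA-enabled values:

import boto3

ec2 = boto3.client('ec2')

# Launch a single c5n.18xlarge with an EFA attached as the primary network interface.
# The AMI, subnet, and security group IDs below are placeholders.
response = ec2.run_instances(
    ImageId='ami-0123456789abcdef0',      # an EFA-enabled AMI
    InstanceType='c5n.18xlarge',
    MinCount=1,
    MaxCount=1,
    NetworkInterfaces=[
        {
            'DeviceIndex': 0,
            'SubnetId': 'subnet-0123456789abcdef0',
            'Groups': ['sg-0123456789abcdef0'],  # must allow all traffic within itself
            'InterfaceType': 'efa',
            'DeleteOnTermination': True,
        }
    ],
)
print(response['Instances'][0]['InstanceId'])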

Now Open – AWS Asia Pacific (Hong Kong) Region

The AWS Region in Hong Kong SAR is now open and you can start using it today. The official name is Asia Pacific (Hong Kong) and the API name is ap-east-1. The AWS Asia Pacific (Hong Kong) Region is the eighth active AWS Region in Asia Pacific and mainland China along with Beijing, Mumbai, Ningxia, Seoul, Singapore, Sydney, and Tokyo. With this launch, AWS now spans 64 Availability Zones within 21 geographic regions around the world. We have also announced plans for 12 more Availability Zones and four more AWS Regions in Bahrain, Cape Town, Jakarta, and Milan. Instances and Services Applications running in this 3-AZ region can use C5, C5d, D2, I3, M5, M5d, R5, R5d, and T3 instances, and can make use of a long list of AWS services including Amazon API Gateway, Application Auto Scaling, AWS Certificate Manager (ACM), AWS Artifact, AWS CloudFormation, Amazon CloudFront, AWS CloudTrail, Amazon CloudWatch, CloudWatch Events, Amazon CloudWatch Logs, AWS CodeDeploy, AWS Config, AWS Config Rules, AWS Database Migration Service, AWS Direct Connect, Amazon DynamoDB, EC2 Auto Scaling, EC2 Dedicated Hosts, Amazon Elastic Block Store (EBS), Amazon Elastic Compute Cloud (EC2), Elastic Container Registry, Amazon ECS, Elastic Load Balancing (Classic, Network, and Application), Amazon EMR, Amazon ElastiCache, Amazon Elasticsearch Service, Amazon Glacier, AWS Identity and Access Management (IAM), Amazon Kinesis Data Streams, AWS Key Management Service (KMS), AWS Lambda, AWS Marketplace, AWS Organizations, AWS Personal Health Dashboard, AWS Resource Groups, Amazon Redshift, Amazon Relational Database Service (RDS), Amazon Aurora, Amazon Route 53 (including Private DNS for VPCs), AWS Shield, AWS Server Migration Service, AWS Snowball, AWS Snowball Edge, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon Simple Storage Service (S3), Amazon Simple Workflow Service (SWF), AWS Step Functions, AWS Support API, Amazon EC2 Systems Manager (SSM), AWS Trusted Advisor, Amazon Virtual Private Cloud, and VM Import/Export. AWS Elastic Beanstalk, Amazon Elastic Container Service for Kubernetes, and AWS X-Ray are scheduled for deployment next month, with other services to follow. We are also working to enable cross-region delivery from SNS topics hosted in other regions to SQS queues hosted in the new region. Using the Asia Pacific (Hong Kong) Region As we announced last month, you need to explicitly enable this region for your AWS account in order to be able to create and manage resources within it. Enabling or disabling a region requires the account:EnableRegion, account:DisableRegion, and account:ListRegions permissions. Here’s a sample IAM policy that grants these permissions for the new region:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "aws-portal:ViewAccount",
        "account:ListRegions"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "account:EnableRegion",
        "account:DisableRegion"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "account:TargetRegion": "ap-east-1"
        }
      }
    }
  ]
}

Log in to the AWS Management Console as a user that has the appropriate permissions and click My Account: Scroll down to the AWS Regions section, find the new region, and click Enable: Then confirm your action by clicking Enable region: The region is enabled immediately, and will be ready for use shortly thereafter.
You can also enable the region by selecting it from the menu: And then confirming your action: Connectivity, Edge Locations, and Latency Hong Kong SAR is already home to three Amazon CloudFront edge locations (the first one opened way back in 2008). There are also more than thirty other edge locations and eleven regional edge caches in Asia; see the Amazon CloudFront Infrastructure page for a full list. The region offers low-latency connections to other cities and AWS regions in the area. There are now two Hong Kong AWS Direct Connect locations: the existing one at iAdvantage Mega-I and a new one at Equinix HK1. Both locations have direct connectivity to the Asia Pacific (Hong Kong) Region. If you already connect to AWS at iAdvantage, you can use your existing Direct Connect connection to access the new region via Direct Connect Gateway. Investing in the Future Before I wrap up I would like to tell you about some of the work that we are doing to support startups and to educate developers: AWS Activate – This global program provides startups with credits, training, and support so that they can build their businesses on AWS. AWS Educate – This global program teaches students about cloud computing. It provides AWS credits to educators and students, along with discounts on training, access to curated content, personalized learning pathways, and collaboration tools. Dozens of Hong Kong universities and business schools are already participating. AWS Academy – This global program is designed to bridge the gap between academia and industry by giving students the knowledge that they need to have in order to qualify for jobs that require cloud skills. The program is built around hands-on experience, and includes an AWS-authored curriculum, access to AWS certification, and accreditation for educators. Training and Certification – This global program helps developers to build cloud skills using digital or classroom training and to validate those skills by earning an industry-recognized credential. It includes learning paths for Cloud Practitioners, Architects, Developers, and Operations. — Jeff;
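If you would like to check programmatically whether the new region has been enabled for your account, here is a minimal boto3 sketch that uses the EC2 DescribeRegions call, which reports an opt-in status for every region:

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# List every region, including those not yet enabled for this account.
regions = ec2.describe_regions(AllRegions=True)['Regions']

for region in regions:
    if region['RegionName'] == 'ap-east-1':
        # OptInStatus is 'opted-in', 'not-opted-in', or 'opt-in-not-required'
        print('ap-east-1 status:', region['OptInStatus'])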

Now Available – AMD EPYC-Powered Amazon EC2 T3a Instances

The AMD EPYC-powered T3a instances that I promised you last year are available now and you can start using them today! Like the recently announced M5ad and R5ad instances, the T3a instances are built on the AWS Nitro System and give you an opportunity to balance your instance mix based on cost and performance. T3a Instances These instances deliver burstable, cost-effective performance and are a great fit for workloads that do not need high sustained compute power but experience temporary spikes in usage. You get a generous and assured baseline amount of processing power and the ability to transparently scale up to full core performance when you need more processing power, for as long as necessary. To learn more about the burstable compute model common to the T3 and the T3a, read New T3 Instances – Burstable, Cost-Effective Performance. You can launch T3a instances today in seven sizes in the US East (N. Virginia), US West (Oregon), Europe (Ireland), US East (Ohio), and Asia Pacific (Singapore) Regions in On-Demand, Spot, and Reserved Instance form. Here are the specs:

Instance Name | vCPUs | RAM | EBS-Optimized Bandwidth | Network Bandwidth
t3a.nano | 2 | 0.5 GiB | Up to 1.5 Gbps | Up to 5 Gbps
t3a.micro | 2 | 1 GiB | Up to 1.5 Gbps | Up to 5 Gbps
t3a.small | 2 | 2 GiB | Up to 1.5 Gbps | Up to 5 Gbps
t3a.medium | 2 | 4 GiB | Up to 1.5 Gbps | Up to 5 Gbps
t3a.large | 2 | 8 GiB | Up to 2.1 Gbps | Up to 5 Gbps
t3a.xlarge | 4 | 16 GiB | Up to 2.1 Gbps | Up to 5 Gbps
t3a.2xlarge | 8 | 32 GiB | Up to 2.1 Gbps | Up to 5 Gbps

The T3 and the T3a instances are available in the same sizes and can use the same AMIs, making it easy for you to try both and find the one that is the best match for your application. Pricing is 10% lower than the equivalent existing T3 instances; see the On-Demand, Spot, and Reserved Instance pricing pages for more info. — Jeff;
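Because the T3a instances use the same credit-based burst model as the T3 instances, it can be useful to keep an eye on the CPU credit balance. Here is a minimal boto3 sketch that reads the CPUCreditBalance metric for a single instance over the last hour; the instance ID is a placeholder:

import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client('cloudwatch')

# Fetch the average CPU credit balance for one instance over the last hour.
stats = cloudwatch.get_metric_statistics(
    Namespace='AWS/EC2',
    MetricName='CPUCreditBalance',
    Dimensions=[{'Name': 'InstanceId', 'Value': 'i-0123456789abcdef0'}],  # placeholder
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=['Average'],
)

for point in sorted(stats['Datapoints'], key=lambda p: p['Timestamp']):
    print(point['Timestamp'], point['Average'])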

Amazon SageMaker Ground Truth Keeps Simplifying Labeling Workflows

Launched at AWS re:Invent 2018, Amazon SageMaker Ground Truth is a capability of Amazon SageMaker that makes it easy for customers to efficiently and accurately label the datasets required for training machine learning systems. A quick recap on Amazon SageMaker Ground Truth Amazon SageMaker Ground Truth helps you build highly accurate training datasets for machine learning quickly. SageMaker Ground Truth offers easy access to public and private human labelers and provides them with built-in workflows and interfaces for common labeling tasks. Additionally, SageMaker Ground Truth can lower your labeling costs by up to 70% using automatic labeling, which works by training Ground Truth from data labeled by humans so that the service learns to label data independently. Amazon SageMaker Ground Truth helps you build datasets for: Text classification. Image classification, i.e. categorizing images in specific classes. Object detection, i.e. locating objects in images with bounding boxes. Semantic segmentation, i.e. locating objects in images with pixel-level precision. Custom user-defined tasks that let customers annotate literally anything. You can choose to use your own team of labelers and route labeling requests directly to them. Alternatively, if you need to scale up, options are provided directly in the Amazon SageMaker Ground Truth console to work with labelers outside of your organization. You can access a public workforce of over 500,000 labelers via integration with Amazon Mechanical Turk. Alternatively, if your data requires confidentiality or special skills, you can use professional labeling companies pre-screened by Amazon, and listed on the AWS Marketplace. Announcing new features Since the service was launched, we gathered plenty of customer feedback (keep it coming!) from companies such as T-Mobile, Pinterest, Change Healthcare, GumGum, Automagi and many more. We used it to define what the next iteration of the service would look like, and just a few weeks ago, we launched two highly requested features: Multi-category bounding boxes, allowing you to label multiple categories within an image simultaneously. Three new UI templates for your custom workflows, for a total of fifteen different templates that help you quickly build annotation workflows for images, text, and audio datasets. Today, we’re happy to announce another set of new features that keep simplifying the process of building and running cost-effective labeling workflows. Let’s look at each one of them. Job chaining Customers often want to run a subsequent labeling job using the output of a previous labeling job. Basically, they want to chain together labeling jobs using the outputted labeled dataset (and outputted ML model if automated data labeling was enabled). For example, they may run an initial job where they identify if humans exist in an image, and then they may want to run a subsequent job where they get bounding boxes drawn around the humans. If active learning was used, customers may also want to use the ML model that was produced in order to bootstrap automated data labeling in a subsequent job. Setup couldn’t be easier: you can chain labeling jobs with just one click! Job tracking Customers want to be able to see the progress of their labeling jobs. We now provide near real-time status for labeling jobs. Long-lived jobs Many customers use experts as labelers, and these individuals perform labeling on a periodic basis.
For example, healthcare companies often use clinicians as their expert labelers, and they can only perform labeling occasionally during downtime. In these scenarios, labeling jobs need to run longer, sometimes for weeks or months. We now support extended task timeout windows where each batch of a labeling job can run for 10 days, meaning labeling jobs can extend for months. Dynamic custom workflows When setting up custom workflows, customers want to insert or use additional context in addition to the source data. For example, a customer may want to display the specific weather conditions above each image in the tasks they send to labelers; this information can help labelers better perform the task at hand. Specifically, this feature allows customers to inject output from previous labeling jobs or other custom content into the custom workflow. This information is passed into a pre-processing Lambda function using the augmented manifest file that includes the source data and additional context. The customer can also use the additional context to dynamically adjust the workflow. New service providers and new languages We are listing two new data labeling service providers on the AWS Marketplace: Vivetic and SmartOne. With the addition of these two vendors, Amazon SageMaker Ground Truth will add support for data labeling in French, German, and Spanish. Regional expansion In addition to US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Tokyo), Amazon SageMaker Ground Truth is now available in Asia Pacific (Sydney). Customer case study: ZipRecruiter ZipRecruiter is helping people find great jobs, and helping employers build great companies. They’ve been using Amazon SageMaker since launch. Says ZipRecruiter CTO Craig Ogg: “ZipRecruiter’s AI-powered algorithm learns what each employer is looking for and provides a personalized, curated set of highly relevant candidates. On the other side of the marketplace, the company’s technology matches job seekers with the most pertinent jobs. And to do all that efficiently, we needed a Machine Learning model to extract relevant data automatically from uploaded resumes”. Of course, building datasets is a critical part of the machine learning process, and it’s often expensive and extremely time-consuming. To solve both problems, ZipRecruiter turned to Ground Truth and one of our labeling partners, iMerit. As Craig puts it: “Amazon SageMaker Ground Truth will significantly help us reduce the time and effort required to create datasets for training. Due to the confidential nature of the data, we initially considered using one of our teams but it would take time away from their regular tasks and it would take months to collect the data we needed. Using Amazon SageMaker Ground Truth, we engaged iMerit, a professional labeling company that has been pre-screened by Amazon, to assist with the custom annotation project. With their assistance we were able to collect thousands of annotations in a fraction of the time it would have taken using our own team.” Getting started I hope that this post was informative, and that the new features will let you build even faster. Please try Amazon SageMaker Ground Truth, let us know what you think, and help us build the next iteration of this cool service! Julien
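The near real-time job tracking described above is also available programmatically. Here is a minimal boto3 sketch that polls the status and label counters of a labeling job; the job name is a placeholder:

import time
import boto3

sagemaker = boto3.client('sagemaker')

# Poll the status of a Ground Truth labeling job until it finishes.
job_name = 'my-labeling-job'  # placeholder

while True:
    job = sagemaker.describe_labeling_job(LabelingJobName=job_name)
    status = job['LabelingJobStatus']   # InProgress, Completed, Failed, Stopped, ...
    counters = job['LabelCounters']     # TotalLabeled, HumanLabeled, MachineLabeled, ...
    print(status, counters)
    if status in ('Completed', 'Failed', 'Stopped'):
        break
    time.sleep(60)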

New – Query for AWS Regions, Endpoints, and More Using AWS Systems Manager Parameter Store

In response to requests from AWS customers, I have been asking our service teams to find ways to make information about our regions and services available programmatically. Today I am happy to announce that this information is available in the AWS Systems Manager Parameter Store, and that you can easily access it from your scripts and your code. You can get a full list of active regions, find out which services are available with them, and much more. Running Queries I’ll use the AWS Command Line Interface (CLI) for most of my examples; you can also use the AWS Tools for Windows PowerShell or any of the AWS SDKs. As is the case with all of the CLI commands, you can request output in JSON, tab-delimited text, or table format. I’ll use JSON, and will make liberal use of the jq utility to show the more relevant part of the output from each query. Here’s how to query for the list of active regions: $ aws ssm get-parameters-by-path \ --path /aws/service/global-infrastructure/regions --output json | \ jq .Parameters[].Name "/aws/service/global-infrastructure/regions/ap-northeast-1" "/aws/service/global-infrastructure/regions/eu-central-1" "/aws/service/global-infrastructure/regions/eu-north-1" "/aws/service/global-infrastructure/regions/eu-west-1" "/aws/service/global-infrastructure/regions/eu-west-3" "/aws/service/global-infrastructure/regions/sa-east-1" "/aws/service/global-infrastructure/regions/us-east-2" "/aws/service/global-infrastructure/regions/us-gov-east-1" "/aws/service/global-infrastructure/regions/us-gov-west-1" "/aws/service/global-infrastructure/regions/us-west-1" "/aws/service/global-infrastructure/regions/ap-northeast-2" "/aws/service/global-infrastructure/regions/ap-northeast-3" "/aws/service/global-infrastructure/regions/ap-south-1" "/aws/service/global-infrastructure/regions/ap-southeast-1" "/aws/service/global-infrastructure/regions/ap-southeast-2" "/aws/service/global-infrastructure/regions/ca-central-1" "/aws/service/global-infrastructure/regions/cn-north-1" "/aws/service/global-infrastructure/regions/cn-northwest-1" "/aws/service/global-infrastructure/regions/eu-west-2" "/aws/service/global-infrastructure/regions/us-west-2" "/aws/service/global-infrastructure/regions/us-east-1" Here’s how to display a complete list of all available AWS services, sort them into alphabetical order, and display the first 10 (out of 155, as I write this): $ aws ssm get-parameters-by-path \ --path /aws/service/global-infrastructure/services --output json | \ jq .Parameters[].Name | sort | head -10 "/aws/service/global-infrastructure/services/acm" "/aws/service/global-infrastructure/services/acm-pca" "/aws/service/global-infrastructure/services/alexaforbusiness" "/aws/service/global-infrastructure/services/apigateway" "/aws/service/global-infrastructure/services/application-autoscaling" "/aws/service/global-infrastructure/services/appmesh" "/aws/service/global-infrastructure/services/appstream" "/aws/service/global-infrastructure/services/appsync" "/aws/service/global-infrastructure/services/athena" "/aws/service/global-infrastructure/services/autoscaling" Here’s how to get the list of services that are available in a given region (again, first 10, sorted): $ aws ssm get-parameters-by-path \ --path /aws/service/global-infrastructure/regions/us-east-1/services --output json | \ jq .Parameters[].Name | sort | head -10 "/aws/service/global-infrastructure/regions/us-east-1/services/acm" "/aws/service/global-infrastructure/regions/us-east-1/services/acm-pca" 
"/aws/service/global-infrastructure/regions/us-east-1/services/alexaforbusiness" "/aws/service/global-infrastructure/regions/us-east-1/services/apigateway" "/aws/service/global-infrastructure/regions/us-east-1/services/application-autoscaling" "/aws/service/global-infrastructure/regions/us-east-1/services/appmesh" "/aws/service/global-infrastructure/regions/us-east-1/services/appstream" "/aws/service/global-infrastructure/regions/us-east-1/services/appsync" "/aws/service/global-infrastructure/regions/us-east-1/services/athena" "/aws/service/global-infrastructure/regions/us-east-1/services/autoscaling" Here’s how to get the list of regions where a service (Amazon Athena, in this case) is available: $ aws ssm get-parameters-by-path \ --path /aws/service/global-infrastructure/services/athena/regions --output json | \ jq .Parameters[].Value "ap-northeast-2" "ap-south-1" "ap-southeast-2" "ca-central-1" "eu-central-1" "eu-west-1" "eu-west-2" "us-east-1" "us-east-2" "us-gov-west-1" "ap-northeast-1" "ap-southeast-1" "us-west-2" Here’s how to use the path to get the name of a service: $ aws ssm get-parameters-by-path \ --path /aws/service/global-infrastructure/services/athena --output json | \ jq .Parameters[].Value "Amazon Athena" And here’s how you can find the regional endpoint for a given service, again using the path: $ aws ssm get-parameter \ --name /aws/service/global-infrastructure/regions/us-west-1/services/s3/endpoint \ --output json | \ jq .Parameter.Value "s3.us-west-1.amazonaws.com" Available Now This data is available now and you can start using it today at no charge. — Jeff; PS – Special thanks to my colleagues Blake Copenhaver and Phil Cali for their help with this post!  

AWS re:Inforce 2019 – Security, Identity, and Compliance

AWS re:Inforce, our new conference dedicated to cloud security, opens in Boston on June 25th. We’re expecting about 8,000 attendees, making this bigger than the first re:Invent! Just like re:Invent, re:Inforce is a learning conference for builders. With over 300 breakout sessions (intermediate, advanced, and expert) spanning four tracks and a virtual Capture The Flag event, attendees will walk away knowing how to use cloud-based infrastructure in a secure and compliant manner. The re:Inforce agenda also includes a healthy collection of bootcamps, chalk talks, workshops, full-day hands-on labs, builder sessions, leadership sessions, and the Security Jam. Diving deeper into the session offerings, a wide range of services will be considered – including (to name a few) AWS WAF, AWS Firewall Manager, AWS KMS, AWS Secrets Manager, AWS Lambda, AWS Control Tower, Amazon SageMaker, Amazon GuardDuty, AWS CloudTrail, Amazon Macie, Amazon RDS, Amazon Aurora, AWS Identity and Access Management, Amazon EKS, and Amazon Inspector. You will have the opportunity to learn about a broad variety of important topics including building secure APIs, encryption, privileged access, auditing the cloud, open source, DevSecOps, building a security culture, hiring/staffing, and privacy by design as well as specific compliance regimes such as PCI, NIST, SOC, FedRAMP, and HIPAA. To learn more about re:Inforce, read the FAQ, check out the Venue & Hotel info, and review the Code of Conduct. Register Now & Save $100 If you register now and use code RFSAL19, you can save $100, while supplies last. — Jeff;    

Safely Validating Usernames with Amazon Cognito

Guest post by AWS Community Hero Larry Ogrodnek. Larry is an independent consultant focused on cloud architecture, DevOps, serverless, and general software development on AWS. He’s always ready to talk about AWS over coffee, and enjoys development and helping other developers. How are users identified in your system? Username? Email? Is it important that they’re unique? You may be surprised to learn, as I was, that in Amazon Cognito both usernames and emails are treated as case-sensitive. So “JohnSmith” is different than “johnsmith,” who is different than “jOhNSmiTh.” The same goes for email addresses—it’s even specified in the SMTP RFC that users “smith” and “Smith” may have different mailboxes. That is crazy! I recently added custom signup validation for Amazon Cognito. In this post, let me walk you through the implementation. The problem with uniqueness Determining uniqueness is even more difficult than just dealing with case insensitivity. Like many of you, I’ve received emails based on Internationalized Domain Name homograph attacks. A site is registered for “example.com” but with Cyrillic character “a,” attempting to impersonate a legitimate site and collect information. This same type of attack is possible for user name registration. If I don’t check for this, someone may be able to impersonate another user on my site. Do you have reservations? Does my application have user-generated messages or content? Besides dealing with uniqueness, I may want to reserve certain names. For example, if I have user-editable information at user.myapp.com or myapp.com/user, what if someone registers “signup” or “faq” or “support”? What about “billing”? It’s possible that a malicious user could impersonate my site and use it as part of an attack. Similar attacks are also possible if users have any kind of inbox or messaging. In addition to reserving usernames, I should also separate out user content to its own domain to avoid confusion. I remember GitHub reacting to something similar when it moved user pages from github.com to github.io in 2013. James Bennet wrote about these issues in great detail in his excellent post, Let’s talk about usernames. He describes the types of validation performed in his django-registration application. Integrating with Amazon Cognito Okay, so now that you know a little bit more about this issue, how do I handle this with Amazon Cognito? Well, I’m in luck, because Amazon Cognito lets me customize much of my authentication workflow with AWS Lambda triggers. To add username or email validation, I can implement a pre-sign-up Lambda trigger, which lets me perform custom validation and accept or deny the registration request. It’s important to note that I can’t modify the request. To perform any kind of case or name standardization (for example, forcing lower case), I have to do that on the client. I can only validate that it was done in my Lambda function. It would be handy if this was something available in the future. To declare a sign-up as invalid, all I have to do is return an error from the Lambda function. In Python, this is as simple as raising an exception. If my validation passes, I just return the event, which already includes the fields that I need for a generic success response. Optionally, I can auto-verify some fields. 
To enforce that my frontend is sending usernames standardized as lowercase, all I need is the following code:

def run(event, context):
    user = event['userName']
    if not user.islower():
        raise Exception("Username must be lowercase")
    return event

Adding unique constraints and reservations I’ve extracted the validation checks from django-registration into a Python module named username-validator to make it easier to perform these types of uniqueness checks in Lambda: pip install username-validator In addition to detecting confusing homoglyphs, it also includes a standard set of reserved names like "www", "admin", "root", "security", "robots.txt", and so on. You can provide your own additions for application-specific reservations, as well as perform individual checks. To add this additional validation and some custom reservations, I update the function as follows:

from username_validator import UsernameValidator

MY_RESERVED = ["larry", "aws", "reinvent"]
validator = UsernameValidator(additional_names=MY_RESERVED)

def run(event, context):
    user = event['userName']
    if not user.islower():
        raise Exception("Username must be lowercase")
    validator.validate_all(user)
    return event

Now, if I attach that Lambda function to the Amazon Cognito user pool as a pre-sign-up trigger and try to sign up for "aws", I get a 400 error. I also get some text that I could include in the signup form: Other attributes, including email (if used), are available under event['request']['userAttributes']. For example:

{ "request": { "userAttributes": { "name": "larry", "email": "larry@example.com" } } }

What’s next? I can validate other attributes in the same way. Or, I can add other custom validation by adding additional checks and raising an exception, with a custom message if it fails. In this post, I covered why it’s important to think about identity and uniqueness, and demonstrated how to add additional validations to user signups in Amazon Cognito. Now you know more about controlling signup validation with a custom Lambda function. I encourage you to check out the other user pool workflow customizations that are possible with Lambda triggers.
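To attach the function without using the console, here is a minimal boto3 sketch that registers it as the pre-sign-up trigger for a user pool. The pool ID and function ARN are placeholders, and keep in mind that UpdateUserPool overwrites settings that you do not pass, so in practice you would start from the pool’s current configuration:

import boto3

cognito = boto3.client('cognito-idp')

# Attach a Lambda function as the pre-sign-up trigger for a user pool.
# Both identifiers below are placeholders.
cognito.update_user_pool(
    UserPoolId='us-east-1_EXAMPLE',
    LambdaConfig={
        'PreSignUp': 'arn:aws:lambda:us-east-1:123456789012:function:validate-username'
    },
)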

The Wide World of Microsoft Windows on AWS

You have been able to run Microsoft Windows on AWS since 2008 (my ancient post, Big Day for Amazon EC2: Production, SLA, Windows, and 4 New Capabilities, shows you just how far AWS has come in a little over a decade). According to IDC, AWS has nearly twice as many Windows Server instances in the cloud as the next largest cloud provider. Today, we believe that AWS is the best place to run Windows and Windows applications in the cloud. You can run the full Windows stack on AWS, including Active Directory, SQL Server, and System Center, while taking advantage of 61 Availability Zones across 20 AWS Regions. You can run existing .NET applications and you can use Visual Studio or VS Code to build new, cloud-native Windows applications using the AWS SDK for .NET. Wide World of Windows Starting from this amazing diagram drawn by my colleague Jerry Hargrove, I’d like to explore the Windows-on-AWS ecosystem in detail: 1 – SQL Server Upgrades AWS provides first-class support for SQL Server, encompassing all four Editions (Express, Web, Standard, and Enterprise), with multiple versions of each edition. This wide-ranging support has helped SQL Server to become one of the most popular Windows workloads on AWS. The SQL Server Upgrade Tool (an AWS Systems Manager script) makes it easy for you to upgrade an EC2 instance that is running SQL Server 2008 R2 SP3 to SQL Server 2016. The tool creates an AMI from a running instance, upgrades the AMI to SQL Server 2016, and launches the new AMI. To learn more, read about the AWSEC2-CloneInstanceAndUpgradeSQLServer action. Amazon RDS makes it easy for you to upgrade your DB Instances to new major or minor versions of SQL Server. The upgrade is performed in-place, and can be initiated with a couple of clicks. For example, if you are currently running SQL Server 2014, you have the following upgrades available: You can also opt-in to automatic upgrades to new minor versions that take place within your preferred maintenance window: Before you upgrade a production DB Instance, you can create a snapshot backup, use it to create a test DB Instance, upgrade that instance to the desired new version, and perform acceptance testing. To learn more about upgrades, read Upgrading the Microsoft SQL Server DB Engine. 2 – SQL Server on Linux If your organization prefers Linux, you can run SQL Server on Ubuntu, Amazon Linux 2, or Red Hat Enterprise Linux using our License Included (LI) Amazon Machine Images. Read the most recent launch announcement or search for the AMIs in AWS Marketplace using the EC2 Launch Instance Wizard: This is a very cost-effective option since you do not need to pay for Windows licenses. You can use the new re-platforming tool (another AWS Systems Manager script) to move your existing SQL Server databases (2008 and above, either in the cloud or on-premises) from Windows to Linux. 3 – Always-On Availability Groups (Amazon RDS for SQL Server) If you are running enterprise-grade production workloads on Amazon RDS (our managed database service), you should definitely enable this feature! It enhances availability and durability by replicating your database between two AWS Availability Zones, with a primary instance in one and a hot standby in another, with fast, automatic failover in the event of planned maintenance or a service disruption. You can enable this option for an existing DB Instance, and you can also specify it when you create a new one: To learn more, read Multi-AZ Deployments Using Microsoft SQL Mirroring or Always On.
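To show how little is involved in enabling Multi-AZ for an existing SQL Server DB Instance, here is a minimal boto3 sketch; the instance identifier is a placeholder, and deferring the change to the next maintenance window is just one reasonable choice:

import boto3

rds = boto3.client('rds')

# Enable Multi-AZ (mirroring / Always On) for an existing SQL Server DB instance.
# The change is applied during the next maintenance window rather than immediately.
rds.modify_db_instance(
    DBInstanceIdentifier='my-sqlserver-instance',  # placeholder
    MultiAZ=True,
    ApplyImmediately=False,
)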
4 – Lambda Support Let’s talk about some features for developers! Launched in 2014, and the subject of continuous innovation ever since, AWS Lambda lets you run code in the cloud without having to own, manage, or even think about servers. You can choose from several .NET Core runtimes for your Lambda functions, and then write your code in either C# or PowerShell: To learn more, read Working with C# and Working with PowerShell in the AWS Lambda Developer Guide. Your code has access to the full set of AWS services, and can make use of the AWS SDK for .NET; read the Developing .NET Core AWS Lambda Functions post for more info. 5 – CDK for .NET The AWS CDK (Cloud Development Kit) for .NET lets you define your cloud infrastructure as code and then deploy it using AWS CloudFormation. For example, this code (stolen from this post) will generate a template that creates an Amazon Simple Queue Service (SQS) queue and an Amazon Simple Notification Service (SNS) topic:

var queue = new Queue(this, "MyFirstQueue", new QueueProps
{
    VisibilityTimeoutSec = 300
});

var topic = new Topic(this, "MyFirstTopic", new TopicProps
{
    DisplayName = "My First Topic Yeah"
});

6 – EC2 AMIs for .NET Core If you are building Linux applications that make use of .NET Core, you can use our Amazon Linux 2 and Ubuntu AMIs. With .NET Core, PowerShell Core, and the AWS Command Line Interface (CLI) preinstalled, you’ll be up and running, and ready to deploy applications, in minutes. You can find the AMIs by searching for core when you launch an EC2 instance: 7 – .NET Dev Center The AWS .NET Dev Center contains materials that will help you to learn how to design, build, and run .NET Applications on AWS. You’ll find articles, sample code, 10-minute tutorials, projects, and lots more: 8 – AWS License Manager We want to help you to manage and optimize your Windows and SQL Server applications in new ways. For example, AWS License Manager helps you to manage the licenses for the software that you run in the cloud or on-premises (read my post, New AWS License Manager – Manage Software Licenses and Enforce Licensing Rules, to learn more). You can create custom rules that emulate those in your licensing agreements, and enforce them when an EC2 instance is launched: The License Manager also provides you with information on license utilization so that you can fine-tune your license portfolio, possibly saving some money in the process! 9 – Import, Export, and Migration You have lots of options and choices when it comes to moving your code and data into and out of AWS. Here’s a very brief summary: TSO Logic – This new member of the AWS family (we acquired the company earlier this year) offers an analytics solution that helps you to plan, optimize, and save money as you make your journey to the cloud. VM Import/Export – This service allows you to import existing virtual machine images to EC2 instances, and export them back to your on-premises environment. Read Importing a VM as an Image Using VM Import/Export to learn more. AWS Snowball – This service lets you move petabyte scale data sets into and out of AWS. If you are at exabyte scale, check out the AWS Snowmobile. AWS Migration Acceleration Program – This program encompasses AWS Professional Services and teams from our partners. It is based on a three step migration model that includes a readiness assessment, a planning phase, and the actual migration.
10 – 21st Century Applications AWS gives you a full-featured, rock-solid foundation and a rich set of services so that you can build tomorrow’s applications today! You can go serverless with the .NET Core support in Lambda, make use of our Deep Learning AMIs for Windows, host containerized apps on Amazon ECS or Amazon EKS, and write code that makes use of the latest AI-powered services. Your applications can make use of recommendations, forecasting, image analysis, video analysis, text analytics, document analysis, text to speech, translation, transcription, and more. 11 – AWS Integration Your existing Windows Applications, both cloud-based and on-premises, can make use of Windows file system and directory services within AWS: Amazon FSx for Windows File Server – This fully managed native Windows file system is compatible with the SMB protocol and NTFS. It provides shared file storage for Windows applications, backed by SSD storage for fast & reliable performance. To learn more, read my blog post. AWS Directory Service – Your directory-aware workloads and AWS Enterprise IT applications can use this managed Active Directory that runs in the AWS Cloud. Join our Team If you would like to build, manage, or market new AWS offerings for the Windows market, be sure to check out our current openings. Here’s a sampling: Senior Digital Campaign Marketing Manager – Own the digital tactics for product awareness and run adoption campaigns. Senior Product Marketing Manager – Drive communications and marketing, create compelling content, and build awareness. Developer Advocate – Drive adoption and community engagement for SQL Server on EC2. Learn More Our freshly updated Windows on AWS and SQL Server on AWS pages contain case studies, quick starts, and lots of other useful information. — Jeff;

Docker, Amazon ECS, and Spot Fleets: A Great Fit Together

Guest post by AWS Container Hero Tung Nguyen. Tung is the president and founder of BoltOps, a consulting company focused on cloud infrastructure and software on AWS. He also enjoys writing for the BoltOps Nuts and Bolts blog. EC2 Spot Instances allow me to use spare compute capacity at a steep discount. Using Amazon ECS with Spot Instances is probably one of the best ways to run my workloads on AWS. By using Spot Instances, I can save 50–90% on Amazon EC2 instances. You would think that folks would jump at a huge opportunity like a Black Friday sales special. However, most folks either seem to not know about Spot Instances or are hesitant. This may be due to some fallacies about Spot. Spot Fallacies With the Spot model, AWS can remove instances at any time. It can be due to a maintenance upgrade; high demand for that instance type; older instance type; or for any reason whatsoever. Hence the first fear and fallacy that people quickly point out with Spot: What do you mean that the instance can be replaced at any time? Oh no, that must mean that within 20 minutes of launching the instance, it gets killed. I felt the same way too initially. The actual Spot Instance Advisor website states: The average frequency of interruption across all Regions and instance types is less than 5%. From my own usage, I have seen instances run for weeks. Need proof? Here’s a screenshot from an instance in one of our production clusters. If you’re wondering how many days that is…. Yes, that is 228 continuous days. You might not get these same long uptimes, but it disproves the fallacy that Spot Instances are usually interrupted within 20 minutes from launch. Spot Fleets With Spot Instances, I place a single request for a specific instance in a specific Availability Zone. With Spot Fleets, instead of requesting a single instance type, I can ask for a variety of instance types that meet my requirements. For many workloads, as long as the CPU and RAM are close enough, many instance types do just fine. So, I can spread my instance bets across instance types and multiple zones with Spot Fleets. Using Spot Fleets makes the system dramatically more robust on top of the already mentioned low interruption rate. Also, I can run an On-Demand cluster to provide additional safeguard capacity. ECS and Spot Fleets: A Great Fit Together This is one of my favorite ways to run workloads because it gives me a scalable system at a ridiculously low cost. The technologies are such a great fit together that one might think they were built for each other. Docker provides a consistent, standard binary format to deploy. If it works in one Docker environment, then it works in another. Containers can be pulled down in seconds, making them an excellent fit for Spot Instances, where containers might move around during an interruption. ECS provides a great ecosystem to run Docker containers. ECS supports a feature called container instance draining that allows me to tell ECS to relocate the Docker containers to other EC2 instances. Spot Instances fire off a two-minute warning signal letting me know when it’s about to terminate the instance. These are the necessary pieces I need for building an ECS cluster on top of Spot Fleet. I use the two-minute warning to trigger ECS container instance draining, and ECS automatically moves containers to another instance in the fleet. Here’s a CloudFormation template that achieves this: ecs-ec2-spot-fleet. Because the focus is on understanding Spot Fleets, the VPC is designed to be simple.
The template specifies two instance types in the Spot Fleet: t3.small and t3.medium with 2 GB and 4 GB of RAM, respectively. The template weights the t3.medium twice as much as the t3.small. Essentially, the Spot Fleet TargetCapacity value equals the total RAM to provision for the ECS cluster. So if I specify 8, the Spot Fleet service might provision four t3.small instances or two t3.medium instances. The cluster adds up to at least 8 GB of RAM. To launch the stack, I run the following command:

aws cloudformation create-stack --stack-name ecs-spot-demo --template-body file://ecs-spot-demo.yml --capabilities CAPABILITY_IAM

The CloudFormation stack launches container instances and registers them to an ECS cluster named development by default. I can change this with the EcsCluster parameter. For more information on the parameters, see the README and the template source. When I deploy the application, the deploy tool creates the ECS cluster itself. Here are the Spot Instances in the EC2 console. Deploy the demo app After the Spot cluster is up, I can deploy a demo app on it. I wrote a tool called Ufo that is useful for these tasks: Build the Docker image. Register the ECS task definition. Register and deploy the ECS service. Create the load balancer. Docker should be installed as a prerequisite. First, I create an ECR repo and set some variables:

ECR_REPO=$(aws ecr create-repository --repository-name demo/sinatra | jq -r '.repository.repositoryUri')
VPC_ID=$(aws ec2 describe-vpcs --filters Name=tag:Name,Values="demo vpc" | jq -r '.Vpcs[].VpcId')

Now I’m ready to clone the demo repo and deploy a sample app to ECS with ufo.

git clone https://github.com/tongueroo/demo-ufo.git demo
cd demo
ufo init --image $ECR_REPO --vpc-id $VPC_ID
ufo current --service demo-web
ufo ship # deploys to ECS on the Spot Fleet cluster

Here’s the ECS service running: I then grab the Elastic Load Balancing endpoint from the console or with ufo ps.

$ ufo ps
Elb: develop-Elb-12LHJWU4TH3Q8-597605736.us-west-2.elb.amazonaws.com

Now I test with curl:

$ curl develop-Elb-12LHJWU4TH3Q8-597605736.us-west-2.elb.amazonaws.com
42

The application returns “42,” the meaning of life, successfully. That’s it! I now have an application running on ECS with Spot Fleet Instances. Parting thoughts One additional advantage of using Spot is that it encourages me to think about my architecture in a highly available manner. The Spot “constraints” ironically result in much better sleep at night as the system must be designed to be self-healing. Hopefully, this post opens the world of running ECS on Spot Instances to you. It’s a core part of the systems that BoltOps has been running on its own production system and for customers. I still get excited about the setup today. If you’re interested in Spot architectures, contact me at BoltOps. One last note: Auto Scaling groups also support running multiple instance types and purchase options. Jeff mentions in his post that weight support is planned for a future release. That’s exciting, as it may streamline the usage of Spot with ECS even further.
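To make the interruption-handling flow described above more concrete, here is a minimal sketch of the kind of helper that could run on each container instance: it watches the instance metadata endpoint for the two-minute Spot interruption notice and then asks ECS to drain the instance. The cluster name, the metadata access (IMDSv1 is assumed), and the polling interval are all simplifications:

import time
import urllib.error
import urllib.request

import boto3

ECS_CLUSTER = 'development'  # the demo cluster name used above; adjust as needed
METADATA = 'http://169.254.169.254/latest/meta-data'

def spot_interruption_pending():
    # The spot/instance-action document returns 404 until an interruption notice is issued.
    try:
        urllib.request.urlopen(f'{METADATA}/spot/instance-action', timeout=1)
        return True
    except urllib.error.HTTPError:
        return False

def drain_this_instance():
    # Find this instance's container instance ARN, then set it to DRAINING so that
    # ECS relocates its tasks to other instances in the fleet.
    instance_id = urllib.request.urlopen(f'{METADATA}/instance-id', timeout=1).read().decode()
    ecs = boto3.client('ecs')
    arns = ecs.list_container_instances(cluster=ECS_CLUSTER)['containerInstanceArns']
    if not arns:
        return
    described = ecs.describe_container_instances(cluster=ECS_CLUSTER, containerInstances=arns)
    mine = [ci['containerInstanceArn'] for ci in described['containerInstances']
            if ci['ec2InstanceId'] == instance_id]
    if mine:
        ecs.update_container_instances_state(
            cluster=ECS_CLUSTER, containerInstances=mine, status='DRAINING')

while True:
    if spot_interruption_pending():
        drain_this_instance()
        break
    time.sleep(5)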

In the Works – AWS Region in Indonesia

Last year we launched two new AWS Regions: a second GovCloud Region in the United States and our first Nordic Region in Sweden. We also announced that we are working on regions in Cape Town, South Africa, and Milan, Italy. Jakarta in the Future Today, I am happy to announce that we are working on the AWS Asia Pacific (Jakarta) Region in Indonesia. The new region will be based in Greater Jakarta, will consist of three Availability Zones, and will give AWS customers and partners the ability to run their workloads and store their data in Indonesia. The AWS Asia Pacific (Jakarta) Region will be our ninth region in Asia Pacific, joining existing regions in Beijing, Mumbai, Ningxia, Seoul, Singapore, Sydney, Tokyo, and an upcoming region in Hong Kong SAR. AWS customers are already making use of 61 Availability Zones across 20 infrastructure regions worldwide. Today’s announcement brings the total number of global regions (operational and in the works) up to 25. We are looking forward to serving new and existing customers in Indonesia and working with partners across Asia Pacific. The addition of the AWS Asia Pacific (Jakarta) Region will enable more Indonesian organizations to leverage advanced technologies such as Analytics, Artificial Intelligence, Database, Internet of Things (IoT), Machine Learning, Mobile services, and more to drive innovation. Of course, the new region will also be open to existing AWS customers who would like to process and store data in Indonesia. We are already working to help prepare developers in Indonesia for the digital future, with programs like AWS Educate and AWS Activate. Dozens of universities and business schools across Indonesia are already participating in our educational programs, as are a plethora of startups and accelerators. Stay Tuned I’ll be sure to share additional news about this and other upcoming AWS regions as soon as I have it, so stay tuned! — Jeff;
