Lec 07: Database

호프 2023. 10. 22. 01:23

Introduction to Database

Flat File Approach

  • Store database as CSV files
    • use a separate file per entity, application must parse the files
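
A minimal sketch of the flat-file approach, assuming a hypothetical `students` entity. The file is simulated with an in-memory string, and the column names are invented for illustration. It shows that parsing and type conversion are left entirely to the application.

```python
import csv
import io

# Hypothetical contents of a students.csv file (one file per entity).
STUDENTS_CSV = """id,name,major
1,Alice,CS
2,Bob,Math
"""

def load_students(text):
    """Parse CSV text into a list of dicts; the application does all the parsing."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        # Every value arrives as a string, so type conversion is also up to us.
        row["id"] = int(row["id"])
        rows.append(row)
    return rows

students = load_students(STUDENTS_CSV)
print(students[0]["name"])
```

Nothing in this setup stops a second program from writing a malformed line or a duplicate `id`, which is where the integrity challenge comes from.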

Challenges

  • Data integrity
  • Implementation
  • Durability

 

Basics of Databases

Databases give access to the data while keeping the integrity of the data in a secure environment.

 

Data Model

  • logical structure of database
  • data model is influenced by the structure of data

Data Structure

  • Structured Data: data formatted so that it fits into related tables; can be used in highly complex queries
  • Unstructured Data: lacks any predefined structure and requires special tools to query; stored as files
  • Semi-structured Data: the structure is not strict and can change, so it is highly flexible (e.g. .json, .xml)
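
To make the difference concrete, here is a small sketch contrasting structured rows with semi-structured JSON. The records and field names are made up for illustration.

```python
import json

# Structured: every record has the same fixed columns (id, name, major).
structured_rows = [
    (1, "Alice", "CS"),
    (2, "Bob", "Math"),
]

# Semi-structured: records share some keys, but each may add or omit fields.
semi_structured = json.loads("""
[
  {"id": 1, "name": "Alice", "major": "CS"},
  {"id": 2, "name": "Bob", "clubs": ["chess", "robotics"]}
]
""")

# Code that reads semi-structured data has to tolerate missing keys.
majors = [doc.get("major", "unknown") for doc in semi_structured]
print(majors)
```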

Schema

  • outlines the relationships within the database and the constraints of the database
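
As an illustration, the sketch below declares a small schema in SQLite (used here only because it ships with Python; the table and column names are hypothetical). The schema states both a relationship and constraints, and the database rejects a row that breaks them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE department (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE student (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    department_id INTEGER NOT NULL REFERENCES department(id)
);
""")

conn.execute("INSERT INTO department (id, name) VALUES (1, 'CS')")
conn.execute("INSERT INTO student (name, department_id) VALUES ('Alice', 1)")

# This row violates the relationship: department 99 does not exist.
try:
    conn.execute("INSERT INTO student (name, department_id) VALUES ('Bob', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```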

Read/Write

  • Read: accessing the data
  • Write: putting new data into the database or changing existing data

Input/output operations per second (IOPS)

  • measure of read and write performance for a storage system, such as the storage behind a database
  • Databases are IOPS intensive
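
A quick worked example of what an IOPS figure implies, using the common relation throughput = IOPS × I/O size. The numbers are illustrative and not taken from the lecture.

```python
def throughput_mib_per_s(iops, io_size_kib):
    """Throughput implied by an IOPS figure at a given I/O size."""
    return iops * io_size_kib / 1024

# Illustrative numbers: 3000 operations per second, 16 KiB per operation.
print(throughput_mib_per_s(3000, 16))
```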

Index

  • physically group records into a predictable order based on the key values -> quickly find the data
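
The effect of an index can be seen with SQLite's query planner. This is a sketch under assumptions: the `users` table is hypothetical, and SQLite keeps the index as a separate ordered structure, so it only demonstrates the faster lookup path, not how any particular engine lays out records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

def query_plan(sql):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)

lookup = "SELECT * FROM users WHERE email = 'user500@example.com'"

plan_before = query_plan(lookup)   # no index on email yet: full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup)    # the planner can now search the index

print(plan_before)
print(plan_after)
```

The exact plan text varies between SQLite versions, but the first plan should report a scan of the table and the second a search using `idx_users_email`.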

Structured Query Language (SQL)

  • standard language for working with relational databases

 

Relational Database vs Non-Relational Database

Relational Database

  • for structured data
  • set of tables with rows and columns
    • row: record
      • each row has an identifier column
      • foreign key
    • column: attribute
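
A small sketch of rows, columns, identifier columns, and a foreign key, again using SQLite as a stand-in relational database. The `customer` and `orders` tables are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(id),  -- foreign key
    total       REAL
);
INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
""")

# The foreign key lets one query combine rows from both tables.
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

print(rows)
```

Because the customer name is stored once and referenced by `customer_id`, it is not repeated on every order row.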

Benefits of Relational Database

  • Ease of use
  • Data integrity controls and accuracy
  • Common shared query language
  • Excellence at reducing redundancy and overall data storage

Non-Relational Database

  • for unstructured and semi-structured data
  • stores data without a fixed tabular structure, using one of many storage models, including key-value pairs, documents, and graphs
    • schemas are dynamic
    • scale out horizontally
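
Two of these storage models can be imitated with plain Python data structures. This is a toy sketch, not the API of any real non-relational database; `find` is a hypothetical helper.

```python
import json

# Key-value model: an opaque value looked up by a key.
session_store = {}
session_store["session:42"] = json.dumps({"user": "alice", "cart": ["book"]})

# Document model: each document carries its own structure (dynamic schema).
products = [
    {"_id": "p1", "name": "Laptop", "specs": {"ram_gb": 16}},
    {"_id": "p2", "name": "T-shirt", "sizes": ["S", "M", "L"]},
]

def find(collection, **criteria):
    """Return documents whose top-level fields match all criteria."""
    return [d for d in collection if all(d.get(k) == v for k, v in criteria.items())]

session = json.loads(session_store["session:42"])
match = find(products, name="T-shirt")
print(session["user"], match[0]["_id"])
```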

Benefits of Non-Relational Database

  • Flexibility
  • Scalability
  • High performance
  • Highly functional APIs
  • Purpose-built: can be customized to the customer's needs

Introduction to Amazon RDS

Database Deployments and Management

  • On-premises database: customer is responsible for everything
  • Hosted on Amazon EC2: AWS is responsible for the layers below the OS
  • AWS managed: AWS is responsible for everything except application optimization

Amazon Relational Database Service (Amazon RDS)

Amazon RDS is a core database service that provides cost-efficient and resizable capacity while automating administration tasks.

  • fully managed solution that supports multiple database engines
  • can scale automatically in response to workload demands

Benefits of Amazon RDS

  • Easy to administer: fully managed
  • Available and durable
  • Highly scalable
  • Fast
  • Secure
  • Inexpensive

 

High Availability of RDS

Multi-AZ Deployment
When you enable Amazon RDS Multi-AZ, RDS creates a secondary copy of your database in another AZ.

  • Primary RDS and Standby RDS

Failover scenario for Multi-AZ

  1. RDS recognizes that the primary instance has failed
  2. The DNS record is updated to point to the standby
  3. The standby (secondary) takes over operations as the new primary instance
  4. In Multi-AZ configurations, a new secondary is automatically rebuilt
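
The four steps above can be sketched as a toy state machine. This only models the sequence described in the notes and is not how RDS is implemented internally; the class, the AZ names, and the way the standby is "rebuilt" are all assumptions for illustration.

```python
class MultiAZDeployment:
    """Toy model of the failover steps listed in the notes."""

    def __init__(self):
        self.instances = {"az-a": "primary", "az-b": "standby"}
        self.healthy = {"az-a": True, "az-b": True}
        self.dns_target = "az-a"  # where the database endpoint currently resolves

    def fail_over(self):
        old = self.dns_target
        # Step 1: detect that the primary has failed.
        if self.healthy[old]:
            return False
        new = next(az for az, role in self.instances.items() if role == "standby")
        # Step 2: point the DNS record at the standby.
        self.dns_target = new
        # Step 3: the standby is promoted to primary.
        self.instances[new] = "primary"
        # Step 4: a replacement standby is rebuilt (modeled by reusing the old slot).
        self.instances[old] = "standby"
        self.healthy[old] = True
        return True

deployment = MultiAZDeployment()
deployment.healthy["az-a"] = False   # simulate a primary failure
failed_over = deployment.fail_over()
print(deployment.dns_target, deployment.instances)
```

Because clients connect through the DNS name rather than a fixed instance address, they reach the new primary without changing their connection settings.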

Read Replica: special type of database instance

  • updates made to the source database are asynchronously copied to the read replica using logical replication
  • reduce the load by routing read queries to the read replica
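
A toy sketch of read routing with asynchronous replication. The class and method names are hypothetical; the point is only that a read served by the replica can briefly lag behind the source.

```python
from collections import deque

class ReplicatedDatabase:
    """Toy model: writes go to the source, reads go to a lagging replica."""

    def __init__(self):
        self.source = {}
        self.replica = {}
        self.pending = deque()  # changes not yet applied to the replica

    def write(self, key, value):
        self.source[key] = value
        self.pending.append((key, value))  # replicated later, not synchronously

    def replicate(self):
        """Apply all queued changes to the replica."""
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value

    def read(self, key):
        # Read queries are routed to the replica to offload the source.
        return self.replica.get(key)

db = ReplicatedDatabase()
db.write("stock", 10)
stale = db.read("stock")   # replica has not caught up yet
db.replicate()
fresh = db.read("stock")
print(stale, fresh)
```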

Multi-AZ vs Read Replicas

| | Multi-AZ deployments | Read replicas |
|---|---|---|
| Replication | Synchronous block-level replication (highly durable) | Asynchronous replication (highly scalable) |
| Instance availability | Only the database engine on the primary instance is active | All read replicas are accessible and can be used for read scaling |
| Backups | Automated backups are taken from the standby | No backups configured by default |
| Regions and AZs | Always span two AZs within a Region | Can be within an AZ, cross-AZ, or cross-Region |
| Upgrades | Database engine version upgrades happen on the primary | Database engine version upgrade is independent of the source instance |
| Failover/promotion | Automatic failover to the standby when a problem is detected | Can be manually promoted to a standalone database instance |

 

Backup

Automatic Backups

  • enabled by default
  • RDS backs up the entire database by creating a storage volume snapshot of the DB instance
  • a backup window of about 30 minutes is recommended to avoid latency issues
  • backups are retained for up to 35 days
  • the DB instance can be restored to a specific point in time within the retention period
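
A simplified sketch of the retention rule: a restore target is only valid if it falls inside the retention window. The function name is hypothetical, and the real service has finer details (for example, how recent the latest restorable time is) that are ignored here.

```python
from datetime import datetime, timedelta

MAX_RETENTION_DAYS = 35  # upper limit stated in the notes

def can_restore_to(target, now, retention_days):
    """True if `target` falls inside the automated-backup retention window."""
    if not 0 < retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention must be between 1 and 35 days")
    earliest = now - timedelta(days=retention_days)
    return earliest <= target <= now

now = datetime(2023, 10, 22, 1, 23)
print(can_restore_to(now - timedelta(days=3), now, retention_days=7))
print(can_restore_to(now - timedelta(days=10), now, retention_days=7))
```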

Manual Database Snapshots

  • snapshots of your DB instance
  • Can be kept longer than 35 days
  • Saved in Amazon S3
  • Available until you delete them

 

Amazon RDS Costs

Billing

  • Pay for RDS using On-Demand or Reserved Instances
  • Billing starts as soon as the DB instance is available
  • Charged on a per-hour basis
  • You must stop or delete the DB instance to avoid being billed for additional DB instance hours
    • while your DB instance is stopped, you can still be charged for provisioned storage and backup storage

AWS free tier

  • 750 hours per month
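
A back-of-the-envelope sketch of per-hour billing with the free-tier allowance. The hourly rate is hypothetical, the allowance is assumed to be shared across all instances in the account, and free-tier eligibility rules are ignored.

```python
FREE_TIER_HOURS = 750  # monthly allowance from the notes

def billable_hours(hours_available, free_tier=True):
    """Instance hours charged after subtracting any free-tier allowance."""
    allowance = FREE_TIER_HOURS if free_tier else 0
    return max(0, hours_available - allowance)

def instance_cost(hours_available, hourly_rate, free_tier=True):
    """Cost of instance hours only; storage and backup charges are separate."""
    return billable_hours(hours_available, free_tier) * hourly_rate

hours_in_31_day_month = 24 * 31  # one instance running the whole month
print(billable_hours(hours_in_31_day_month))
# Two instances running all month, at a hypothetical rate of 0.02 per hour.
print(instance_cost(2 * hours_in_31_day_month, hourly_rate=0.02))
```

One instance running for a full 31-day month uses 744 hours, which fits inside the 750-hour allowance; a second instance running alongside it would not.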

Setting up Amazon RDS

Amazon RDS Creation

Database creation method

  • Standard create: Availability, security, backups, maintenance
  • Easy create: Specify only the DB engine type, DB instance size, DB instance identifier

Engine Type

Templates

Deployment Option

  • Multi-AZ DB cluster: a primary DB instance and two readable standby DB instances
  • Multi-AZ DB instance: a primary DB instance and a standby DB instance (not readable)
  • Single DB instance: free-tier eligible

DB Cluster Identifier

  • must be unique across all the DB clusters or instances in current Region in your account.

DB Instance Class

  • Standard class: general purpose
  • Memory optimized: memory-intensive applications
  • Burstable performance: ability to burst to full CPU usage

Storage Type

  • for most DB engines, RDS DB instances use Amazon EBS volumes
    • RDS automatically stripes across multiple Amazon EBS volumes to enhance performance
  • General purpose
  • Provisioned IOPS
  • Magnetic

 

Connectivity Option

Compute Resource

  • whether or not to connect the database to an EC2 compute resource

VPC

  • choose an existing VPC or create a new VPC
  • DB must be within a VPC

Security Group

  • The DB subnet group defines which subnets the DB cluster can use in the selected VPC
  • Amazon RDS requires two subnets in two different AZs for high availability
  • Public access: choosing No means that RDS won't assign a public IP address to the DB
  • For security, keep the DB private and make sure that it isn't accessible from the internet
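
The console choices above map onto parameters of the RDS API. The sketch below only assembles keyword arguments in the shape that boto3's `create_db_instance` accepts and does not call AWS. The helper name, identifier, subnet group name, security group ID, and sizing values are all hypothetical.

```python
def build_db_instance_params(identifier, password, multi_az=False):
    """Assemble keyword arguments in the shape boto3's create_db_instance expects."""
    return {
        "DBInstanceIdentifier": identifier,      # must be unique per Region and account
        "Engine": "mysql",                       # engine type
        "DBInstanceClass": "db.t3.micro",        # a burstable performance class
        "StorageType": "gp2",                    # General Purpose SSD
        "AllocatedStorage": 20,                  # GiB
        "MasterUsername": "admin",
        "MasterUserPassword": password,
        "MultiAZ": multi_az,                     # standby in another AZ when True
        "DBSubnetGroupName": "my-db-subnet-group",        # hypothetical name
        "VpcSecurityGroupIds": ["sg-0123456789abcdef0"],  # hypothetical ID
        "PubliclyAccessible": False,             # no public IP address
        "BackupRetentionPeriod": 7,              # days of automated backups
    }

params = build_db_instance_params("lecture-db", "change-me-please")
# With boto3 installed and AWS credentials configured, the call would be:
#   boto3.client("rds").create_db_instance(**params)
print(params["PubliclyAccessible"])
```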

Using RDS

Connecting to the Database

Copy the endpoint and port number into any standard SQL client application to connect to a database on the DB instance.
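
A minimal sketch of collecting those connection details in code. The endpoint, user, and database name are hypothetical, and port 3306 assumes a MySQL engine.

```python
def connection_settings(endpoint, port, user, database):
    """Collect what a SQL client needs to reach a database on the DB instance."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return {"host": endpoint, "port": port, "user": user, "database": database}

# Hypothetical endpoint; copy the real one from the RDS console.
settings = connection_settings(
    endpoint="lecture-db.abc123example.us-east-1.rds.amazonaws.com",
    port=3306,  # default MySQL port
    user="admin",
    database="school",
)
# These map onto typical client arguments, e.g. `mysql -h HOST -P PORT -u USER -p`.
print(f"{settings['user']}@{settings['host']}:{settings['port']}/{settings['database']}")
```

If the DB instance is private, as recommended above, the client has to run somewhere that can reach the VPC, such as an EC2 instance in the same VPC.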