
TianqGuo/ML_Infra_POC


Basic PoCs for ML infrastructure pipelines and related settings.

PoC1

1. The pipeline consists of AWS Glue, AWS Lambda, and Amazon S3, integrated through AWS Step Functions.

2. Suitable IAM roles/users need to be set up for the various groups or purposes.

3. To keep the services scalable and extensible, component configuration must be set correctly, for example using distinct S3 folder names to separate different triggers/policies.

4. The state machine workflow is defined in the Step Functions file.
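
To make the workflow described above more concrete, here is a minimal sketch of a Step Functions definition that runs a Glue job and then invokes a Lambda function. It is not the definition file from this repository. The state names, the job name, and the Lambda ARN are hypothetical placeholders, and the two-step shape is an assumption about the pipeline.

```python
import json


def build_state_machine(glue_job_name: str, lambda_arn: str) -> dict:
    """Return an Amazon States Language definition: Glue job, then Lambda."""
    return {
        "Comment": "PoC1 sketch: run a Glue job, then invoke a Lambda function",
        "StartAt": "RunGlueJob",
        "States": {
            "RunGlueJob": {
                "Type": "Task",
                # The .sync suffix makes Step Functions wait for the job to finish.
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": glue_job_name},
                "Next": "InvokeLambda",
            },
            "InvokeLambda": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                # Pass the whole state input to the function as its payload.
                "Parameters": {"FunctionName": lambda_arn, "Payload.$": "$"},
                "End": True,
            },
        },
    }


definition = build_state_machine(
    glue_job_name="poc1-etl-job",  # hypothetical job name
    lambda_arn="arn:aws:lambda:us-east-1:123456789012:function:poc1-postprocess",  # hypothetical ARN
)
definition_json = json.dumps(definition, indent=2)
```

The resulting `definition_json` string is the kind of document a state machine is created from. The IAM role that executes it still needs permissions for Glue and Lambda, as noted in item 2.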

image

PoC2

1. The only difference between this PoC and PoC1 is the feature store loading step.

2. The feature store loading step requires two layers, as shown in the code; without them the workflow will fail.
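
As a rough illustration of what a feature store loading step might do, the sketch below converts rows into the record format used by the `put_record` call and writes them one by one. It assumes the feature store is Amazon SageMaker Feature Store, which the text does not confirm. The function names are hypothetical, and the client is passed in so the logic can be exercised without AWS access.

```python
from typing import Any, Dict, List


def to_feature_record(row: Dict[str, Any]) -> List[Dict[str, str]]:
    """Convert a flat row into the FeatureName/ValueAsString list used by put_record."""
    return [
        {"FeatureName": name, "ValueAsString": str(value)}
        for name, value in row.items()
        if value is not None  # skip missing values instead of writing "None"
    ]


def load_rows(client: Any, feature_group_name: str, rows: List[Dict[str, Any]]) -> int:
    """Write each row with put_record and return the number of rows written."""
    for row in rows:
        client.put_record(
            FeatureGroupName=feature_group_name,
            Record=to_feature_record(row),
        )
    return len(rows)
```

In a Lambda function, `client` would typically be `boto3.client("sagemaker-featurestore-runtime")`. Each row would also need to carry the record identifier and event time features defined on the feature group.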

image

PoC3

1. This is a high-level design for handling streaming data and storing both stream data features and batch data aggregation features.

2. This is a PoC only; details such as event notifications, IAM, error handling, and related configuration are not included in the code.
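
The difference between the two kinds of features can be sketched in plain Python: stream features are derived from a single event as it arrives, while batch aggregation features are computed over an accumulated set of events. The event fields (`user_id`, `amount`) and the threshold are hypothetical and do not come from this repository.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def stream_features(event: Dict) -> Dict:
    """Per-event features, computed as each event arrives."""
    amount = float(event["amount"])
    return {
        "user_id": event["user_id"],
        "amount": amount,
        "is_large": amount >= 100.0,  # hypothetical threshold
    }


def batch_features(events: Iterable[Dict]) -> Dict[str, Dict[str, float]]:
    """Per-user aggregates, computed over an accumulated batch of events."""
    amounts: Dict[str, List[float]] = defaultdict(list)
    for event in events:
        amounts[event["user_id"]].append(float(event["amount"]))
    return {
        user: {
            "count": float(len(values)),
            "total": sum(values),
            "mean": sum(values) / len(values),
        }
        for user, values in amounts.items()
    }
```

Both outputs could then be written to the same feature store, keyed by `user_id`, so that a model can read recent and aggregated features together.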

image

PoC4

1. This is a simplified workflow for infrastructure changes and model/API changes.

2. In production, many more elements need to be considered, such as different environments (DEVX, UATX, PRDX), various tags, dependencies, input parameters, KMS...

3. The Docker image build process can be integrated into Jenkins or CircleCI instead of AWS CodeCommit, which is used in this example.

4. Model size and performance-related metrics need to be considered properly.
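
One way to keep per-environment settings explicit is to generate the deployment parameters from a single function, as in the sketch below. Only the environment names come from the text above. The stack naming scheme, the tag keys, and the parameter names are hypothetical.

```python
from typing import Dict

ENVIRONMENTS = ("DEVX", "UATX", "PRDX")  # environment names from the list above


def deployment_params(env: str, image_tag: str, kms_key_alias: str) -> Dict[str, object]:
    """Build the parameters for one environment, rejecting unknown names early."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env}")
    return {
        "stack_name": f"ml-infra-{env.lower()}",  # hypothetical naming scheme
        "image_tag": image_tag,
        "kms_key_alias": kms_key_alias,
        "tags": {"Environment": env, "Project": "ML_Infra_POC"},
    }
```

A CI job could call `deployment_params("DEVX", image_tag, key_alias)` after the Docker image build and pass the result to whichever deployment tool is in use.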

image
