Guangning Yu's Blog

Notes | Professional Scrum Product Owner (PSPO) II Certification

2023-10-31 14:22:25

Notes

1. Introduction

  • About the exam
    • $250 USD per attempt
    • Passing score: 85%
    • Time limit: 60 minutes
    • Number of Questions: 40 (partial credit provided on some questions)
    • Format: Multiple Choice, Multiple Answer
    • Lifetime certification - no annual renewal fee required
    • Recommended courses:
      • Professional Scrum Product Owner
      • Professional Scrum Product Owner - Advanced
    • Practice assessments:
      • Product Owner Open
      • Evidence-Based Management Open
      • Scrum Open

2. Scrum Team

  • The Scrum Team consists of
    one Scrum Master, one Product Owner, and Developers.
  • The team should have all the skills needed to create the product. Relying on people outside the team creates dependencies, and dependencies lead to delays. A knowledg

A comparison of Power BI and native web tools

2023-02-09 11:01:46  |  PowerBI

Native Web Tool

  Pros:
    1. Widely used and supported by a large community of developers.
    2. Highly customizable and flexible, allowing for a wide range of solutions to be developed.
    3. Can be integrated with a variety of databases and APIs.

  Cons:
    1. Steep learning curve, as JavaScript can be complex and requires a good understanding of programming concepts.
    2. Can be difficult to maintain and debug, especially for large and complex applications.
    3. Can be slow in older browsers and on less optimized devices.

Power BI

  Pros:
    1. User-friendly interface, making it easy for non-technical users to create and customize reports.
    2. Offers a range of built-in visualization options and tools for data analysis.
    3. Integrates well with other Microsoft/Azure tools, such as Excel and SharePoint.

  Cons:
    1. Limited customization options compared to JavaScript.
    2. Can be less flexible and may not be suitable for more complex reporting requirements.
    3.

About ChatGPT

2022-12-09 13:00:17
  • ChatGPT comes from the OpenAI research lab and is powered by the GPT-3.5 series of models. These models, including the versions before 3.5, were trained on text and code data using Azure AI supercomputing infrastructure.
  • The most important change in the GPT-3.5 series is fine-tuning built on real human feedback. In this newly adopted training approach, labelers write the responses they expect from the model and rank the model's candidate answers against those expectations; the rankings are then used to reward the model. Over continuous iterations, this feedback is fed into a reward model, which is used to optimize the parameters (see the sketch below).

  • Training steps
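
A minimal sketch of the ranking-based reward idea described above, assuming the standard pairwise comparison loss used for reward models; the function and the toy scores are illustrative, not OpenAI's actual implementation:

    import numpy as np

    def pairwise_reward_loss(score_preferred, score_rejected):
        """Loss for one labeled comparison: -log(sigmoid(r_preferred - r_rejected)).

        It is small when the reward model already scores the labeler-preferred
        response higher than the rejected one, and large otherwise.
        """
        return -np.log(1.0 / (1.0 + np.exp(-(score_preferred - score_rejected))))

    # Toy scores: the preferred answer currently gets 2.0, the rejected one 0.5.
    print(pairwise_reward_loss(2.0, 0.5))  # ~0.20 (correct ordering, small loss)
    print(pairwise_reward_loss(0.5, 2.0))  # ~1.70 (wrong ordering, large loss)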

Azure Certification Guide

2022-12-05 12:44:39  |  Azure

Notes | AZ-500 Microsoft Azure Security Technologies

2022-12-05 14:56:13  |  Azure

The Azure AD Identity Platform

Managing Azure AD Tenant and Azure Subscription Associations

Azure AD Identities

Azure AD Groups

Azure AD Dynamic Groups

Azure AD Administrative Units

Hybrid Identity

Azure AD Connect and Hybrid Identities

Azure AD External Identities

User Flows with External Identities

Controlling Access

Securing Azure and Azure Active Directory (AD)

Azure Role-Based Access Control (RBAC)

Azure AD Roles

Custom Roles

Securing Identities and Access

Azure AD Privileged Identity Management (PIM)

Access Reviews

Azure AD Identity Protection

Conditional Access

Azure AD Passwordless Authentication

Securing Virtual Networks

Virtual Network Routing

Network S

Setup OpenVPN Command Line on Ubuntu

2022-04-27 12:57:06
  1. Install the OpenVPN client:

     sudo apt-get -y install openvpn

  2. Create a credentials file containing the username and password:

     sudo touch /etc/openvpn/credentials
     sudo printf '%s\n' 'username' 'password' > /etc/openvpn/credentials

  3. Point the .ovpn config at the credentials file so the connection does not prompt for them:

     sudo sed -i 's/auth-user-pass/auth-user-pass \/etc\/openvpn\/credentials/g' /etc/openvpn/US-East.ovpn

  4. Start the VPN connection:

     sudo openvpn --config /etc/openvpn/US-East.ovpn

Reference:
How to Setup OpenVPN Command Line on Linux (Ubuntu)

Power BI Performance Tuning

2022-04-14 11:51:40

Best Practices

Import

  1. Turn on Query Caching for imported tables
  2. Turn on Large Dataset Storage format

DirectQuery

  1. Use a star schema for the data model and ensure that dimension tables contain proper keys and that those keys relate to a fact table
  2. Materialize all aggregations, transformations, and calculations in SQL Server
  3. Pre-filter large fact tables in SQL Server and remove those filters from the visuals
    • Adding filters in visuals causes Power BI to generate multiple SQL queries
  4. Use dynamic M query parameters to move measure logic from Power BI to SQL Server
  5. Optimize measure definitions to generate efficient DAX queries
    • Power BI translates complex DAX into multiple SQL Server queries
  6. Use dual storage mode for dimension tables
    • Otherwise Power BI uses the imported dimension tables to generate long SQL queries for filters
  7. Refine the table structure and indexing strategy
    • A clustered rowstore index (e.g. on TimeId) plus a non-clustered columnstore index is recommended
  8. Only

Notes | AZ-900 Microsoft Azure Fundamentals

2021-08-09 17:03:39  |  Azure

Compute

Virtual Machine

Azure Functions

App Services

Azure Container Instances (ACI)

Azure Kubernetes Service (AKS)

Azure Container Registry (ACR)

Windows Virtual Desktop

Networking

Virtual Network (VNet)

Network Security Group (NSG)

Application Security Group

Load Balancer

VPN Gateway

Application Gateway

ExpressRoute

Paired Region

Storage

Storage Account

Blob

Disk

File

Archive

Database

Cosmos DB

Azure SQL

Azure Database for PostgreSQL

Database Migration Services

Identity and Access Management (IAM)

Azure Active Directory (AAD)

Group

Role

Scope

Role-Based Access Control (RBAC)

Azure AD Join

Azure AD Connect

Single Sign-On

Self-Service Password Reset (SSPR)

I

AWS Solution Architect Professional Notes

2021-06-04 16:30:48

S3

  • Maximum object size is 5TB; largest object in a single PUT is 5GB.
  • Multipart upload is recommended for objects larger than 100 MB (see the sketch after this list).
  • Security: IAM policies -> Bucket policy -> Object ACL
  • Versioning cannot be enabled at the object level. It's a bucket-level feature.
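
A minimal sketch of such a multipart upload with boto3's managed transfer, assuming local AWS credentials; the bucket and file names are placeholders, and boto3 switches to multipart automatically once the file size exceeds multipart_threshold:

    import boto3
    from boto3.s3.transfer import TransferConfig

    # Use multipart upload for anything over 100 MB, uploading up to 8 parts in parallel.
    config = TransferConfig(
        multipart_threshold=100 * 1024 * 1024,
        multipart_chunksize=100 * 1024 * 1024,
        max_concurrency=8,
    )

    s3 = boto3.client("s3")
    # upload_file splits the file into parts, retries failed parts, and reassembles them.
    s3.upload_file("backup.tar.gz", "my-example-bucket", "backups/backup.tar.gz", Config=config)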

Amazon Neptune

Amazon Redshift

Amazon Athena

Amazon Quantum Ledger Database

Amazon Managed Blockchain

Amazon Timestream Database

Amazon DocumentDB

Amazon ElasticSearch

Databases

S3 Select vs Athena vs Redshift Spectrum

S3 Select is focused on retrieving data from S3 using SQL:

S3 Select enables applications to retrieve only a subset of data from an object by using simple SQL expressions. By using S3 Select to retrieve only the data needed by your application, you can achieve drastic performance increases – in many cases you can get as much as a 400% improvement compared with classic S3
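
A minimal sketch of an S3 Select call from boto3, assuming a CSV object with a header row; the bucket, key, and column names are placeholders:

    import boto3

    s3 = boto3.client("s3")

    # S3 filters the object server-side and streams back only the matching rows.
    response = s3.select_object_content(
        Bucket="my-example-bucket",
        Key="data/orders.csv",
        ExpressionType="SQL",
        Expression="SELECT s.order_id, s.amount FROM s3object s "
                   "WHERE CAST(s.amount AS FLOAT) > 100",
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )

    # The response payload is an event stream; Records events carry the result bytes.
    for event in response["Payload"]:
        if "Records" in event:
            print(event["Records"]["Payload"].decode("utf-8"), end="")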

SQL Server Queries

2020-11-18 14:26:27  |  sql_server

Modify column type

    ALTER TABLE ZAWTH_Raw_Core_Branch.raw.POS_TRANSACTION ALTER COLUMN [transaction_line_id] VARCHAR(128) NULL;

Check currently running queries

reference: https://stackoverflow.com/a/29400789

    SELECT SPID = er.session_id
        ,STATUS = ses.STATUS
        ,[Login] = ses.login_name
        ,Host = ses.host_name
        ,BlkBy = er.blocking_session_id
        ,DBName = DB_Name(er.database_id)
        ,CommandType = er.command
        ,ObjectName = OBJECT_NAME(st.objectid)
        ,CPUTime = er.cpu_time
        ,StartTime = er.start_time
        ,TimeElapsed = CAST(GETDATE() - er.start_time AS TIME)
        ,SQLStatement = st.text
    FROM sys.dm_exec_requests er
    OUTER APPLY sys.dm_exec_sql_text(er.sql_handle) st
    LEFT JOIN sys.dm_exec_sessions ses
        ON ses.session_id = er.session_id
    LEFT JOIN sys.dm_exec_connections con
        ON con.session_id = ses.session_id
    WHERE st.text IS NOT NULL
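
If you prefer to run the same check from Python, a minimal pyodbc sketch (the connection details are placeholders) that pulls a few of the columns above:

    import pyodbc

    conn = pyodbc.connect(
        'DRIVER={ODBC Driver 17 for SQL Server};SERVER=xxx;UID=xxx;PWD=xxx;DATABASE=xxx'
    )
    cursor = conn.cursor()
    cursor.execute(
        "SELECT er.session_id, ses.login_name, er.command, er.cpu_time, st.text "
        "FROM sys.dm_exec_requests er "
        "OUTER APPLY sys.dm_exec_sql_text(er.sql_handle) st "
        "LEFT JOIN sys.dm_exec_sessions ses ON ses.session_id = er.session_id "
        "WHERE st.text IS NOT NULL"
    )
    # Each row is one currently running request.
    for row in cursor.fetchall():
        print(row.session_id, row.login_name, row.command, row.cpu_time)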

Mount the Amazon EFS File System on the EC2 Instance

2020-08-08 09:53:34
  1. Create EFS on AWS web portal

  2. Edit the security group of the EFS to allow access from the EC2 instances

  3. Mount the EFS on the EC2 instance

     sudo mkdir efs
     sudo chmod 777 /efs

  4. Install amazon-efs-utils for auto-remount

     git clone https://github.com/aws/efs-utils
     cd efs-utils/
     ./build-deb.sh
     sudo apt-get -y install ./build/amazon-efs-utils*deb

  5. Configure the IAM role on the EC2 instance (already done)

  6. Edit /etc/fstab

     fs-xxxxxxxx:/ /efs efs _netdev,tls,iam 0 0

  7. Test the mount

     sudo mount -fav

  8. Add the Linux users to each other's groups to avoid read-only issues

     sudo usermod -a -G ubuntu guangningyu
     sudo usermod -a -G guangningyu ubuntu

Reference:
1. Mount the Amazon EFS File System on the EC2 Instance and Test
2. Mounting your Amazon EFS file system automatically
3. User and Group ID Permissions for Files and Directories Within a File System

Test PySpark max()/min() function

2020-07-13 09:22:44  |  Spark

test.csv

    key,a,b,c
    a,1,,-1
    a,2,,
    a,3,,4

test.py

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession \
        .builder \
        .appName("spark-app") \
        .getOrCreate()
    spark.sparkContext.setLogLevel("WARN")

    df = spark.read.csv("test.csv", header=True)
    res = df.groupBy(["key"]).agg(*[
        F.max("a"),
        F.max("b"),
        F.max("c"),
        F.min("a"),
        F.min("b"),
        F.min("c"),
    ])
    print(res.toPandas())

spark-submit test.py

      key max(a) max(b) max(c) min(a) min(b) min(c)
    0   a      3   None      4      1   None     -1

The output shows that max() and min() ignore null values: column b, which contains only nulls, is the only one that returns None.

Install Azure Cli on Mac

2020-02-25 15:13:21
  brew update && brew install azure-cli
  az login

  brew tap azure/functions
  brew install azure-functions-core-tools@2

References:
Install Azure CLI on macOS
Azure/azure-functions-core-tools

Create User in Windows Server 2016

2019-12-16 14:31:22
  1. Run [Server Manager] and open [Tools] - [Computer Management].
  2. Right-click [Users] under [Local Users and Groups] in the left pane and select [New User].
  3. Input a user name and password for the new user and click the [Create] button. The other items are optional.
  4. After creation, the new user is shown in the list.
  5. If you'd like to grant administrative privileges to the new user, right-click the user and open [Properties].
  6. Move to the [Member of] tab and click the [Add] button.
  7. Specify the [Administrators] group.
  8. Make sure the [Administrators] group has been added to the list and click the [OK] button to finish the settings.

Reference: Windows Server 2016 : Initial Settings : Add Local Users

AWS Certified Solutions Architect Associate Notes

2019-12-05 17:59:42  |  AWS

Compute

EC2

  • Billing for interrupted Spot Instance
  • When you launch an instance from an AMI, it uses either paravirtual (PV) or hardware virtual machine (HVM) virtualization. HVM virtualization uses hardware-assist technology provided by the AWS platform.
  • The information about the instance can be retrieved from the instance metadata service (http://169.254.169.254/latest/meta-data/).
  • The underlying Hypervisor for EC2:
    • Xen
    • Nitro
  • Standard Reserved Instances cannot be moved between regions. You can choose if a Reserved Instance applies to either a specific AZ or an entire region, but you cannot change the region.
  • About EC2 Auto Scaling
    • Can span multi-AZ
  • About Placement Group
    • Three types of Placement Groups
      • Clustered Placement Group
        • Within a single AZ
        • Used for applications that need low network latency, high network throughput, or both
        • Only certain instances can be launched into a Clustered Placement Group
        • AWS r

Mount S3 bucket on EC2 Linux Instance

2019-09-16 10:51:31  |  AWS
  1. Install dependencies

     sudo apt-get update
     sudo apt-get install automake autotools-dev fuse g++ git libcurl4-gnutls-dev libfuse-dev libssl-dev libxml2-dev make pkg-config

  2. Install s3fs

     git clone https://github.com/s3fs-fuse/s3fs-fuse.git
     cd s3fs-fuse
     ./autogen.sh
     ./configure --prefix=/usr --with-openssl
     make
     sudo make install
     which s3fs

  3. Configure credentials

     echo "Your_accesskey:Your_secretkey" >> /etc/passwd-s3fs
     sudo chmod 640 /etc/passwd-s3fs

  4. Create the mount point and mount the bucket

     mkdir /mys3bucket
     s3fs your_bucketname -o use_cache=/tmp -o allow_other -o uid=1001 -o mp_umask=002 -o multireq_max=5 /mys3bucket

  5. Configure the mount after reboot

     Add the following command in /etc/rc.local:

     /usr/local/bin/s3fs your_bucketname -o use_cache=/tmp -o allow_other -o uid=1001 -o mp_umask=002 -o multireq_max=5 /mys3bucket

Reference:
How to Mount S3 bucket on EC2 Linux Instance

Setup Nextcloud on Ubuntu

2019-09-09 23:06:06
  • Install Nextcloud

    # Install Nextcloud stack
    sudo snap install nextcloud
    # Create administrator account
    sudo nextcloud.manual-install <admin_username> <admin_password>
    # Configure trusted domains (only localhost by default)
    sudo nextcloud.occ config:system:get trusted_domains
    sudo nextcloud.occ config:system:set trusted_domains 1 --value=<dns-domain>
    # Set 512M as PHP memory limit
    sudo snap get nextcloud php.memory-limit  # Should be 512M
    sudo snap set nextcloud php.memory-limit=512M
    # Set background jobs interval (e.g. checking for new emails, update RSS feeds, ...)
    sudo snap set nextcloud nextcloud.cron-interval=10m  # Default: 15m

  • Set reverse proxy

    sudo snap set nextcloud ports.http=81 ports.https=444

Reference:
Nextcloud on AWS
Putting the snap behind a reverse proxy

Kubeless Basics

2019-05-14 10:09:51  |  Kubernetes

Deploy kubeless to a Kubernetes cluster

    $ export RELEASE=$(curl -s https://api.github.com/repos/kubeless/kubeless/releases/latest | grep tag_name | cut -d '"' -f 4)
    $ kubectl create ns kubeless
    $ kubectl create -f https://github.com/kubeless/kubeless/releases/download/$RELEASE/kubeless-$RELEASE.yaml

    $ kubectl get pods -n kubeless
    $ kubectl get deployment -n kubeless
    $ kubectl get customresourcedefinition

Deploy sample function

    # test.py (Python 2 syntax, matching the python2.7 runtime below)
    def hello(event, context):
        print event
        return event['data']

    $ kubeless function deploy hello --runtime python2.7 \
        --from-file test.py \
        --handler test.hello

    $ kubectl get functions
    $ kubeless function ls

    $ kubeless function call hello --data 'Hello world!'

Windows cmd

2019-03-19 18:31:36  |  Windows
  • create a file

    echo This is a sample text file > sample.txt

  • delete a file

    del file_name

  • move a file

    move stats.doc c:\statistics

  • combine files

    copy /b file1 + file2 file3

Load Excel file into SQL Server

2019-03-07 15:57:56  |  Python
    import pandas as pd
    import pyodbc
    import sqlalchemy
    import urllib.parse


    def get_sqlalchemy_engine(driver, server, uid, pwd, database):
        conn_str = 'DRIVER={};SERVER={};UID={};PWD={};DATABASE={}'.format(driver, server, uid, pwd, database)
        quoted = urllib.parse.quote_plus(conn_str)
        engine = sqlalchemy.create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted))
        return engine


    if __name__ == '__main__':
        # create engine
        driver = 'ODBC Driver 17 for SQL Server'
        server = 'xxx'
        uid = 'xxx'
        pwd = 'xxx'
        database = 'xxx'
        engine = get_sqlalchemy_engine(driver, server, uid, pwd, database)

        # read excel
        file_path = 'xxx'
        df = pd.read_excel(file_path)

        # load into SQL Server
        schema_name = 'xxx'
        table_name = 'xxx'
        df.to_sql(table_name, schema=schema_name, con=engine, index=False, if_exists='replace')
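
For larger spreadsheets, pyodbc's fast_executemany option (exposed through SQLAlchemy's mssql+pyodbc dialect) usually speeds up the insert considerably; a hedged variant of the helper above, where the _fast suffix is just an illustrative name:

    def get_sqlalchemy_engine_fast(driver, server, uid, pwd, database):
        conn_str = 'DRIVER={};SERVER={};UID={};PWD={};DATABASE={}'.format(driver, server, uid, pwd, database)
        quoted = urllib.parse.quote_plus(conn_str)
        # fast_executemany batches the parameterized INSERTs on the driver side.
        return sqlalchemy.create_engine(
            'mssql+pyodbc:///?odbc_connect={}'.format(quoted),
            fast_executemany=True,
        )

Pairing it with a chunksize in df.to_sql (e.g. chunksize=10000) keeps memory usage predictable while still benefiting from the batched inserts.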