Using Python, GPT, and TiDB to Analyze Stocks

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 用python gpt tidb 来分析股票

| username: tidb狂热爱好者

GPT wrote the code below to read the NASDAQ-100 ticker symbols and store the results in TiDB.

import requests
import yfinance as yf
import pandas as pd
from sqlalchemy import create_engine

# List of ticker symbols for the NASDAQ-100 companies, filled in below
nasdaq_top_100 = []

# Set request header information so the NASDAQ API accepts the request
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"
}

# Send GET request for the NASDAQ-100 constituent list
res = requests.get("https://api.nasdaq.com/api/quote/list-type/nasdaq100", headers=headers)

# Parse JSON data
main_data = res.json()['data']['data']['rows']

# Print the ticker symbols and collect them
for row in main_data:
    print(row['symbol'])
    nasdaq_top_100.append(row['symbol'])

# Create an empty list to store each company's DataFrame
dataframes = []

# Iterate through the list of ticker symbols
for ticker in nasdaq_top_100:
    # Get stock data
    stock = yf.Ticker(ticker)

    # Get financial statement data
    balance_sheet = stock.balance_sheet
    income_statement = stock.financials

    # Some tickers lack the statement rows we need; skip them
    try:
        net_income = income_statement.loc['Net Income']
        shareholder_equity = balance_sheet.loc['Stockholders Equity']
    except KeyError:
        continue

    # Calculate ROE for the most recent reporting period
    roe = (net_income / shareholder_equity).iloc[0]

    # Create a DataFrame to store the current company's ROE
    df = pd.DataFrame({'Ticker': [ticker], 'ROE': [roe]})

    # Add the DataFrame to the list
    dataframes.append(df)

# Use pd.concat to merge all companies' DataFrames
roe_df = pd.concat(dataframes)

# Output the result
print(roe_df)

# Write the result to TiDB (replace username/password/ip with real values)
engine = create_engine('mysql+pymysql://username:password@ip/test')
roe_df.to_sql('df', engine, index=False,
              method="multi", chunksize=10000, if_exists='replace')

That's the end of the code.
If you don't understand it, just ask GPT.

Before this, I only knew that Microsoft and Nvidia were highly profitable. With this code, I found that NASDAQ biotechnology companies are also extremely profitable, with profit margins as staggering as Apple's; a quick search shows biomedicine ranked No. 1.
Data analysis really does let us achieve better returns.
Writing this code used to be quite difficult, but with the advent of GPT, writing code has become incredibly simple.

Thus, the seven giants of NASDAQ, known in the market as the "Magnificent Seven", are the seven major US growth and technology stocks: Apple, Microsoft, Alphabet, Amazon, Nvidia, Tesla, and Meta Platforms. Biomedicine deserves to be included as well.

Next, import the NASDAQ candlestick (K-line) data into TiDB, and record your own investment amounts so you can calculate your rate of return yourself.
Other ideas can come later. If you open an account abroad, you won't be bled by the management fees of domestic funds; a 2% management fee is quite outrageous.
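
For recording those investment amounts, here is a hypothetical sketch first (the positions table name, its columns, and the numbers are all made up; replace them with your real holdings). The K-line import script itself follows right after.

import pymysql

# Same connection parameters as the import script below
db_params = {'host': 'ip', 'user': 'username', 'password': 'password', 'db': 'test'}
connection = pymysql.connect(**db_params)
cursor = connection.cursor()

# Hypothetical table holding each position: ticker, share count, amount invested
cursor.execute("""
CREATE TABLE IF NOT EXISTS positions (
    symbol VARCHAR(10) PRIMARY KEY,
    shares FLOAT,
    invested FLOAT
)
""")

# Made-up example rows; replace them with your real positions
cursor.executemany(
    "REPLACE INTO positions (symbol, shares, invested) VALUES (%s, %s, %s)",
    [('MSFT', 10, 3500.0), ('AAPL', 20, 3400.0), ('NVDA', 5, 600.0)]
)
connection.commit()
cursor.close()
connection.close()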

import requests
import yfinance as yf
import pandas as pd
import pymysql
from datetime import datetime, timedelta

# Define database connection parameters (replace with real values)
db_params = {
    'host': 'ip',
    'user': 'username',
    'password': 'password',
    'db': 'test'
}

# Connect to TiDB through the MySQL protocol
connection = pymysql.connect(**db_params)
cursor = connection.cursor()

# Create a new table for NASDAQ-100 data; a symbol column is added so
# rows from different tickers can be told apart
create_table_query = """
CREATE TABLE IF NOT EXISTS nasdaq_100s (
    symbol VARCHAR(10),
    date DATE,
    open FLOAT,
    high FLOAT,
    low FLOAT,
    close FLOAT,
    adj_close FLOAT,
    volume BIGINT
)
"""
cursor.execute(create_table_query)

# Set request header information so the NASDAQ API accepts the request
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"
}

# Send GET request for the NASDAQ-100 constituent list
res = requests.get("https://api.nasdaq.com/api/quote/list-type/nasdaq100", headers=headers)

# Parse JSON data
main_data = res.json()['data']['data']['rows']

# Print the ticker symbols and collect them
nasdaq_top_100 = []
for row in main_data:
    print(row['symbol'])
    nasdaq_top_100.append(row['symbol'])

# Calculate the date 10 years ago from today
ten_years_ago = datetime.now() - timedelta(days=3650)

insert_query = """
INSERT INTO nasdaq_100s (symbol, date, open, high, low, close, adj_close, volume)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s)
"""

for ticker_symbol in nasdaq_top_100:
    # Fetch the historical data from yfinance; auto_adjust=False keeps
    # the separate 'Adj Close' column
    data = yf.download(ticker_symbol, start=ten_years_ago.strftime('%Y-%m-%d'),
                       auto_adjust=False)

    # Newer yfinance versions return MultiIndex columns; flatten them
    if isinstance(data.columns, pd.MultiIndex):
        data.columns = data.columns.get_level_values(0)

    # Insert the data into the TiDB table
    for date, row in data.iterrows():
        cursor.execute(insert_query, (ticker_symbol, date.date(), row['Open'],
                                      row['High'], row['Low'], row['Close'],
                                      row['Adj Close'], row['Volume']))

# Commit changes and close the connection
connection.commit()
cursor.close()
connection.close()
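
With the candles imported, the rate of return mentioned earlier can be computed by joining the hypothetical positions table against each ticker's latest close. A sketch, assuming the symbol column from the table definition above:

import pymysql

db_params = {'host': 'ip', 'user': 'username', 'password': 'password', 'db': 'test'}
connection = pymysql.connect(**db_params)
cursor = connection.cursor()

# Value each position at its latest close and compare with the amount invested
cursor.execute("""
SELECT p.symbol,
       p.shares * n.close AS current_value,
       p.invested,
       (p.shares * n.close - p.invested) / p.invested AS rate_of_return
FROM positions p
JOIN nasdaq_100s n ON n.symbol = p.symbol
WHERE n.date = (SELECT MAX(date) FROM nasdaq_100s WHERE symbol = p.symbol)
""")
for symbol, value, invested, ret in cursor.fetchall():
    print(f"{symbol}: value={value:.2f}, invested={invested:.2f}, return={ret:.2%}")

cursor.close()
connection.close()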

The distribution of my funds across the 10 stocks I bought is as follows.

Use the imported data for graphical analysis

After organizing the data, it’s time to generate the charts

This is where all my money went: Microsoft 10%, Apple 10%, Nvidia 2%, Costco 2%, Broadcom 2%.
Among them, MSFT runs Azure, Amazon runs AWS, and Google runs Google Cloud, so that covers the three clouds; Nvidia is AI, Costco is the supermarket, Broadcom is chips, and Meta is social.
How about looking at these stocks on an annual basis?

TiDB is essentially a database cloud company.
Behind all of this data, there are essentially just SQL statements.

select FROM_UNIXTIME(UNIX_TIMESTAMP(date) - UNIX_TIMESTAMP(date) % 31536000) as stamp, avg(close) as v from nasdaq_100s group by stamp order by stamp limit 100000;
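
To turn that yearly aggregation into one of the charts mentioned above, a minimal sketch with pandas and matplotlib could look like this (it groups by YEAR(date) instead of unix-timestamp arithmetic, which is simpler for a DATE column):

import pandas as pd
import matplotlib.pyplot as plt
from sqlalchemy import create_engine

engine = create_engine('mysql+pymysql://username:password@ip/test')

# Average close per calendar year across the whole nasdaq_100s table
yearly = pd.read_sql(
    "SELECT YEAR(date) AS y, AVG(close) AS avg_close "
    "FROM nasdaq_100s GROUP BY y ORDER BY y",
    engine
)

# Plot the yearly averages as a simple bar chart
yearly.plot(x='y', y='avg_close', kind='bar', legend=False)
plt.xlabel('Year')
plt.ylabel('Average close')
plt.tight_layout()
plt.show()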

At the end of the article, I recommend a serious book

If you have gained something, give it a thumbs up

| username: Kongdom | Original post link

:+1: Step-by-step tutorial, forever the best

| username: zhaokede | Original post link

Got it!

| username: DBAER | Original post link

That’s really impressive.

| username: WalterWj | Original post link

I helped adjust the format a bit. Brother, you should learn some basic Markdown.

| username: Kongdom | Original post link

:yum: Strongly recommend starting a special topic: Rebirth - Learning Markdown in TiDB

| username: WalterWj | Original post link

:sob::sob::sob:

| username: 大飞哥online | Original post link

Awesome

| username: jiayou64 | Original post link

A dedicated column should be created, named “The Road to Learning TiDB - Starting from Stock Analysis”.

| username: TIDB-Learner | Original post link

Is the stock market currently in a bull market or a bear market?

| username: changpeng75 | Original post link

Is the volume of stock data large enough to require the use of a distributed database?

| username: wangkk2024 | Original post link

Impressive :+1:t2:

| username: xiaoqiao | Original post link

:+1: :+1:

| username: TiDBer_21wZg5fm | Original post link

Python data scraping plus data warehouse analysis. Slick!

| username: 友利奈绪 | Original post link

A true all-rounder. Awesome!

| username: TiDBer_fbU009vH | Original post link

How is the result?

| username: cassblanca | Original post link

At first glance, it looks awesome.

| username: tidb狂热爱好者 | Original post link

Brother, that's because you haven't put all your assets into the stock market.
Wherever my assets are, that's where I put my effort.

| username: changpeng75 | Original post link

Using it is not a problem, of course. It's just that the volume of stock data isn't large enough to require a distributed database. Traditional databases offer better performance, and the latency of a distributed database can affect the efficiency of data analysis.

| username: 哈喽沃德 | Original post link

Is the expert Mr. Jiang?