Shanghai Second-Hand Housing Transaction Data Monitoring
In the past year or two, I've been paying close attention to transaction volume in the Shanghai second-hand housing market. Similar to stock market volume, it can help predict housing price trends.
When I was young, all I cared about was working, learning technology, and making money. Once, while chatting with a colleague, he told me the house he'd bought was worth 8 million. The average worker...
I was quite shocked to hear that. 😲 I thought I'd never reach that in my lifetime. Windfalls like that, riding the economic boom of the era, are rare. Looking back, when I really wanted to buy a house, non-local singles were still restricted from purchasing. By the time I finally got my hukou and was allowed to buy, housing prices had already plummeted! So, with a tight wallet, all I could do was wait and see.
I can't afford 8 million yuan, but I can always keep an eye on things. The duck knows first when the spring river warms; fail to seize the opportunity and you'll be a toiling laborer in your next life too. Besides, I made more money in the stock market last year than I did from my job, which felt fantastic. Until you're truly wealthy, you still have to go to work.
Without further ado: Shanghai publishes second-hand housing transaction data daily at https://www.fangdi.com.cn/old_house/old_house.html. The site is ugly, but it's authoritative; if you disagree, the government can still make you pay more tax. Every day at noon I would open it and take a look. Over time I found it hard to get at historical data, and opening the site by hand became a hassle. So I wanted a program to collect the data automatically and monitor it in Prometheus. I could even set alert conditions, say, an alert whenever the daily transaction volume exceeds 1,500 units.
Below is a record of the process:
1. Automatic Collection Program
The code is as follows:
import requests
from flask import Flask
from prometheus_client import Gauge, generate_latest, CollectorRegistry

app = Flask(__name__)
registry = CollectorRegistry()
# Define a Gauge metric to store second-hand house sales data
sell_count_gauge = Gauge('shanghai_second_hand_house_sell_count',
                         'Shanghai Second Hand House Sell Count', registry=registry)

@app.route('/metrics')
def metrics():
    try:
        # Create a session object for cookie management
        session = requests.Session()

        # First visit the main page to obtain the initial cookie
        main_url = 'https://www.fangdi.com.cn/old_house/old_house.html'
        main_headers = {
            'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36'
        }
        session.get(main_url, headers=main_headers)

        # Define the request URL
        api_url = 'https://www.fangdi.com.cn/oldhouse/getSHYesterdaySell.action'

        # Set the request headers
        headers = {
            'Accept': 'application/json, text/javascript, */*; q=0.01',
            'Accept-Language': 'en-US,en;q=0.9',
            'Connection': 'keep-alive',
            'Content-Type': 'application/x-www-form-urlencoded; charset=utf-8',
            'Origin': 'https://www.fangdi.com.cn',
            'Referer': 'https://www.fangdi.com.cn/old_house/old_house.html',
            'Sec-Fetch-Dest': 'empty',
            'Sec-Fetch-Mode': 'cors',
            'Sec-Fetch-Site': 'same-origin',
            'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36',
            'X-Requested-With': 'XMLHttpRequest',
            'sec-ch-ua': '"Not:A-Brand";v="24", "Chromium";v="134"',
            'sec-ch-ua-mobile': '?0',
            'sec-ch-ua-platform': '"Linux"'
        }

        # Send a POST request using the session object
        response = session.post(api_url, headers=headers, data="")
        response.raise_for_status()

        # Parse the JSON response
        data = response.json()
        sell_count = data.get('sellcount')

        if sell_count is not None:
            # Set the Gauge metric value
            sell_count_gauge.set(sell_count)
            # Generate metrics data in Prometheus format
            metrics_data = generate_latest(registry).decode('utf-8')
            return metrics_data, 200, {'Content-Type': 'text/plain; version=0.0.4; charset=utf-8'}
        else:
            return "No transaction quantity information found", 500
    except requests.RequestException as e:
        return f"Request error: {e}", 500
    except ValueError:
        return "Unable to parse response JSON data", 500

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8000)
The above program is very simple: it first visits the main page to pick up a cookie, then simulates a browser request to fetch the target data. Note the endpoint URL: https://www.fangdi.com.cn/oldhouse/getSHYesterdaySell.action. It may change in the future. What if the site gets redesigned?
If it does, use your browser's developer tools to find the new request. Newbies might ask, "How do you know which headers to send?" In the browser's network panel, use "Copy as cURL", then let an AI convert it into the format the Requests library expects. AI is well suited to this kind of chore: it rarely makes mistakes and is highly efficient.
If anything in the program is unclear, ask an AI to explain it. After all, being good at using tools to solve problems is the key to success.
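As a quick sanity check (assuming the program is running locally on port 8000), hitting the endpoint should return Prometheus's text exposition format; the value shown below is just illustrative:

➜ ershoufang curl -s http://localhost:8000/metrics
# HELP shanghai_second_hand_house_sell_count Shanghai Second Hand House Sell Count
# TYPE shanghai_second_hand_house_sell_count gauge
shanghai_second_hand_house_sell_count 743.0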
2. Deploy Prometheus to automatically collect data
The above code already generates metrics. There are numerous tutorials online for deploying Prometheus on your local computer, so I won't go into detail here.
The program above, i.e. the collection client, also runs locally; I manage it with supervisord:
➜ ershoufang sudo supervisorctl status shanghai
[sudo] password for mephisto:
shanghai                         RUNNING   pid 781, uptime 2:12:17
➜ ershoufang cat /etc/supervisor.d/shanghai.ini
[program:shanghai]
; replace command and directory with the actual paths to your Flask program
command=/home/mephisto/github/ershoufang/.venv/bin/python shanghai.py
directory=/home/mephisto/github/ershoufang
autostart=true
autorestart=true
stderr_logfile=/var/log/app.err.log
stdout_logfile=/var/log/app.out.log
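If supervisord was already running when you added the ini file, a standard reread/update picks it up:

➜ ershoufang sudo supervisorctl reread
➜ ershoufang sudo supervisorctl update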
Confirm that both the client (collection program) and the server (Prometheus) are running:
➜ ershoufang sudo ss -lntp | grep -E "python|prometheus"
LISTEN 0 128  0.0.0.0:8000  0.0.0.0:*  users:(("python",pid=781,fd=3))
LISTEN 0 4096       *:9090        *:*  users:(("prometheus",pid=796,fd=6))
No problem, both ports are listening properly. You can also try opening the corresponding service addresses in a browser.
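Or check from the command line; /-/healthy is Prometheus's built-in health endpoint:

➜ ershoufang curl -s http://localhost:9090/-/healthy
Prometheus Server is Healthy.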
3. Set the scraping frequency
This step is to configure the prometheus.yml file.
➜ ershoufang sudo cat /etc/prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 30m # Set the scrape interval to every 30 minutes. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: 'shanghai_second_hand_house'
    scrape_interval: 1h
    static_configs:
      - targets: ['localhost:8000']
The last job, shanghai_second_hand_house, is the one you need to add. The scrape interval is set to 1h, i.e. once an hour; anything more frequent would be excessive. A charge of disturbing public order and five days of detention wouldn't be a good look. Don't waste public resources; be a law-abiding citizen. If you hammer the site with requests and knock it over, it causes trouble for everyone. You understand.
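As for the alert condition mentioned at the start (daily volume above 1,500 units), here is a minimal sketch of a rule file; the name first_rules.yml matches the commented-out entry under rule_files above, and actually delivering the alert assumes you also wire up an Alertmanager:

# /etc/prometheus/first_rules.yml (illustrative)
groups:
  - name: shanghai_house_alerts
    rules:
      - alert: DailySellCountHigh
        expr: shanghai_second_hand_house_sell_count > 1500
        labels:
          severity: info
        annotations:
          summary: "Shanghai daily second-hand housing sales exceeded 1,500 units"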
4. Verify the Results
Open your browser at http://localhost:9090. The screenshot is as follows:
Please switch to the Graph view and use Stacked chart mode. For other settings, see the screenshot above.
The results may contain multiple sampling points with the same value for the same day. This isn't a big deal, as data is collected hourly. As long as you can see the daily transaction volume, it's sufficient.
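If the duplicate points bother you, a PromQL expression along these lines collapses a day's samples into one value (a sketch; adjust the range to taste):

max_over_time(shanghai_second_hand_house_sell_count[1d])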
We can see that recently around 800 units have been selling on weekdays, with weekend sales rising significantly to about 1,400. Last year the transaction volume was typically around 400-600 units, though I can't remember exactly, which is precisely why I want to save the data.
I set the data retention to 3 years (storage retention: 3y); I'm too lazy to type out the exact steps here.
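For reference, retention is a startup flag rather than a prometheus.yml setting; on a typical installation the invocation looks something like this (adjust paths to your setup):

prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.retention.time=3y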
Of course, you can also send the information directly to your WeChat or email address instead of storing it in Prometheus. I just personally find this method more suitable for me.
If you have any further questions, follow our WeChat official account and leave a message. Incidentally, the interface above also returns the transaction area and total transaction price, so the average price can be derived; the program can be extended to export multiple metrics, as sketched below.
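A minimal sketch of such an extension, assuming the JSON also carries area and total-price fields; the keys sellarea and sellmoney below are hypothetical, so inspect the actual response first:

from prometheus_client import Gauge, CollectorRegistry

registry = CollectorRegistry()
sell_area_gauge = Gauge('shanghai_second_hand_house_sell_area',
                        'Shanghai Second Hand House Sell Area', registry=registry)
avg_price_gauge = Gauge('shanghai_second_hand_house_avg_price',
                        'Shanghai Second Hand House Average Price', registry=registry)

def update_extra_metrics(data: dict) -> None:
    # 'sellarea' and 'sellmoney' are hypothetical keys; check the real JSON.
    sell_area = data.get('sellarea')
    sell_money = data.get('sellmoney')
    if sell_area and sell_money:
        sell_area_gauge.set(float(sell_area))
        # Average price = total transaction price / total transaction area
        avg_price_gauge.set(float(sell_money) / float(sell_area))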
Copyright statement:
- Unless otherwise noted, all content is original; please do not reprint without authorization (reprints tend to mangle the typesetting, and the content then can't be controlled or kept up to date);
- For non-commercial quotation of any content from this blog, please cite the relevant page on this site as the 'original source' or a 'reference link' (for readers' convenience).
See Also:
- OpenLDAP Monitoring
- Easy to get real-time results of celery tasks
- Mini console assembly notes
- Labwc replaces customized skin
- Practical automatic proxy configuration example
- Openvpn Example
- Fastapi WeChat public account development brief
- Quickly hide and call out the terminal
- Hysteria Science Internet Brief
- Greetd and greetd tutorial