Posts

No Escape

In the Economist article "The world’s waste problem is growing fast", Mr. Oliver did an excellent job as a journalist: digging up facts, gathering them, and putting them on the table, so that everyone has to face them, think about them, and try to fix the problem in different ways. The article is an introduction to Mr. Oliver’s book, Wasteland. In the book he wrote down what he saw in some special places, which he calls wastelands, to expose some shady corners of the global and local economic networks, such as plastic pollution, toxic colonialism, planned obsolescence, etc. Plastic pollution is the problem closest to our daily life. Years ago in China, plastic bags were free in supermarkets and grocery stores. Then the government (at least in Beijing) enacted a law mandating a charge for plastic bags. Since then there has been a charge for plastic bags in supermarkets, while they remain free in most small grocery stores. People are more and more aware of plastic pollution. However, at least by my observation, the ...

Mouth vs Brain

A couple of days ago I read two articles about LLMs (large language models) and ChatGPT. The first one is "How ChatGPT and Other LLMs Work—and Where They Could Go Next" on Wired. It mainly discusses the following topics. The main idea behind an LLM: given some user-provided words (usually a question), the LLM chooses the most suitable words to follow, without understanding the meaning of the question or of the sentence it emits. The technical lineage: from neural networks to the transformer, then to LLMs. How human participation improves model quality: here the article introduces the basic idea of RLHF (reinforcement learning from human feedback), the silver bullet of ChatGPT. The second article, "How ChatGPT actually works", contains far more technical details. It explains why ChatGPT-4 is superior to its precursors (better alignment), the detailed steps of RLHF, how to evaluate the performance of an LLM, and finally the shortcomings of the methodology used in current model training. After reading the above articles, ...
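The "choose the most suitable following words" idea can be sketched in a few lines. This is only a toy illustration, not how any real model is implemented: the candidate tokens and their scores in `toy_scores` are made up, and a real LLM would compute such scores with a neural network over a vocabulary of many thousands of tokens.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    exps = {token: math.exp(s - highest) for token, s in scores.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}


def pick_next_token(scores):
    """Greedy choice: return the candidate with the highest probability."""
    probs = softmax(scores)
    return max(probs, key=probs.get)


# Hypothetical scores for the word following "The capital of France is".
toy_scores = {"Paris": 4.0, "London": 1.5, "banana": -2.0}
next_token = pick_next_token(toy_scores)
print(next_token)
```

The function never looks at what the tokens mean; it only ranks numbers, which is the point the Wired article makes.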

The Value of Group Sports

This afternoon we watched the movie "Slam Dunk". Besides the exciting offense and defense scenes on the court and the delicate depiction of the relationships between family members, I have some new thoughts on sports and games. People in East Asia traditionally believe that intellectual training is the essence of education. Compared with it, sports are more or less optional and subordinate in the educational system. P.E. classes are frequently taken over by language, math, or science classes. In junior and senior high school, I was a fan of ping-pong. I spent most of my spare time, and even class time, reading books and journals about it and practicing basic and complex skills. Thirty years later, I have to say the young me missed some essential parts of games. Besides strategies, tactics, and skills, games are more about communication and cooperation. The basketball court is always full of physical contact, fierce confrontations, and even altercations and injuries. However, you must...

Life on Habitica

Two weeks have passed since I installed Habitica on my phone. It's really a wonderful tool, like AntennaPod and @Voice, which changed my lifestyle dramatically. I put a couple of items on it. Please forgive me for listing them out. Tasks in the "Habits" channel have a plus button. Every time you finish a task, click the button to get the corresponding rewards (experience and coins, with which you can buy food or equipment). The tasks here can also have an optional "minus" button. Every time you fail the task, click the button to accept a punishment (negative rewards). So far I've added only one habit: going to bed before 23:00. Tasks in the "Daily" channel need to be practiced on a daily/weekly/monthly basis. You can choose how many times the task must be done in a time unit. Here are my daily tasks: Listen to an English/Japanese/Cantonese episode (podcast with AntennaPod, audiobook/magazine with Musicolet); Learning Japanese/Cantonese on Duolingo; Learning n...

Life as an RPG Game

We all have many ideas about our lives: learning new skills, keeping healthy, reading an interesting book, going to sleep earlier... However, as people say, buying a book is always easier than reading it. We make so many wonderful plans and then forget most of them. Year after year, we stay at the starting point, imagining a better version of ourselves in a parallel universe. Why can't we keep our promises and make real changes? An important factor is that, as time goes by, keeping a promise becomes more and more boring, and it gets replaced by new (yet never practiced) plans. Why does our passion diminish day by day? Because we can't get feedback in everyday life, since fundamental changes need a long time to take effect. Here games give us a perspective for solving the issue. Why does our passion for games never fade? Because in games we get feedback very soon. If we can get feedback as soon as possible on reading books or keeping up exercise, they may not be abandoned as time goes by. Hab...

A Good Tech Writing Tool

Tech writing differs from general writing in several ways. For example, technical documents usually contain some code blocks. Their rendering is key to a better comprehension of the article. In many cases, the code blocks are "live" rather than static. They are executed and the outputs (calculation results, including text and images) are inserted into the text as a part of the article. Another key component of tech writing is math formulas, whose rendering is also very important. Nowadays MathJax is more prevalent than LaTeX. Other requirements include auto-numbered sections, cross-references, writing in plain text (hence easy integration with a version control system), and report generation in popular formats (such as HTML and PDF), etc. The tools or frameworks I've used for tech writing include AsciiDoc, R Markdown, and Jupytext, each with its own pros and cons. Today I found Quarto (based on Pandoc). Its feature set fits perfec...

A Brave Chinese

Today I listened to a recording of an old Chinese man's comments about the relationship with Russia and the U.S. That's exactly what I want to say. As he said, the U.S. played an important role in the Chinese modernization process. However, the CCP teaches our kids to hate it, as it did 70 years ago. What did we get after that? Now, they play the trick again. When I talk with my parents, I always feel pity and anger for them. They're very nice people. They don't hurt anyone. They obey all rules and orders, with or without consent. They worked hard all their lives. They saved every cent. But what did they get? A poor education and a distorted worldview based on systematic lies. Thanks to the CCP's state-of-the-art brainwashing, they hate the U.S., which has always tried to help us. They embrace Russia, even though it has always bullied us. If you don't agree with me about who is a friend and who is an enemy, please listen to the old man's comments in full. Then tell me he told lies.

ChatGPT as a Personal Translator

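After reading a good article about the mechanism of ChatGPT, I spontaneously got the idea of testing ChatGPT's translation ability with it. So I entered the following text in the ChatGPT prompt: > Please translate the article at the following URL to Chinese: https://www.assemblyai.com/blog/how-chatgpt-actually-works/ > Please translate the article at the following URL into Chinese: https://www.assemblyai.com/blog/how-chatgpt-actually-works/ > An explanation of how ChatGPT actually works > In the fields of natural language processing and artificial intelligence, OpenAI's GPT series has become a high-profile achievement. It is used in many areas, including automatic question answering, machine translation, and chatbots, the most famous of which is ChatGPT. In this article, we will explore how ChatGPT works, as well as its strengths and weaknesses in chatbot applications. > What is ChatGPT? > ChatGPT is a pre-trained model for generating human-like dialogue. It was developed by the OpenAI team and makes use of its latest language processing techniques, including the transformer, self-supervised learning, and pre-trained models. ChatGPT is a variant of GPT-3, but its implementation focuses more on chatbot applications. > How is ChatGPT trained? > ChatGPT is trained in the same way as other GPT models, using self-supervised learning. This approach requires a large amount of text data to train the model so that it can learn the structure, grammar, and semantics of language. The text data used by ChatGPT was collected from the internet and was cleaned and preprocessed so that the model could learn from it better. > How does ChatGPT work? > The working principle of ChatGPT is very simple. When a user enters a question or a piece of dialogue, ChatGPT uses natural language processing techniques to understand the user's intent and generate a response. The response is produced by the text generation techniques already trained into the model, and it imitates the way humans converse as closely as possible...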

Pending Reading List

This list contains excellent books that I want to finish but find challenging to read alone. So my dear friend, if you are interested in any of them, please let me know. And let's experience a fantastic journey of reading, thinking, discussion, and finally transcendence together. Which books will be added to this list? Well, first, they are masterpieces in their fields. Most of them have scores above 4 on Goodreads or 8.5 on Douban. Second, the reading experience is wonderful. The beautiful text and fabulous ideas of the author make me not care how much time reading takes. In fact, they deserve to be read more than once. That said, how can we read a book together, since reading is a personal activity? IMO, we should distinguish casual reading (no different from watching movies or playing games) from intellectual reading, which is the essence of intellectual life. Intellectual life is seldom discussed nowadays. But it's the best lifestyle I can imagine so far. It d...

On Podcast Clients

As a podcast fan, I have subscribed to a bunch of wonderful channels. For news, there are BBC News, Up First of NPR, World Story of the Day, Today in Focus, The Daily of New York Times, Chinese News of NHK, On the Media of WNYC, etc. For current affairs commentary, there are The Foreign Affairs Interview, The Political Scene of New Yorker, bumingbai.net, cdt.media, Sinica, China Insider, The NPR Politics Podcast, BBC Documentary, etc. Other categories include movie and song commentary, science and technology, storytelling, games, language learning, etc. On Android, AntennaPod and Google Podcasts are the clients I use most. On Linux, gPodder is a good choice. As open-source software, it downloads all episodes as MP3 files, which can be synced to a local folder easily. It works well in WSL, thanks to the excellent job done by the WSL team of MS. MP3 exporting is particularly important in mainland China, for many of the channels are blocked and can only be shared as files rather than links. The com...

A New Path of Higher Education

I'm reading the book "Building the International University: Minerva and the Future of Higher Education" by Stephen M. Kosslyn these days. It's really a mind-blowing book about higher education. We all want to live a decent, wonderful life in a safe, civilized society. But how? Some say we need to be rich. Others say we need freedom and democracy. As time flows into 2023, we know they're not enough. Middle-class Chinese in big cities are rich, but they don't feel safe. Russia became a democratic country after the USSR collapsed. However, the Russians voted to elect a dictator who has ruled for more than 20 years and pushed them into an absurd, brutal war. Then how about giving the Chinese democracy? That sounds good. But if you've lived in China long enough, you'll find that the Chinese today would probably elect another dictator democratically, like what the Russians did 20 years ago. So what are the missing pieces in their jigsaw? One of them is political v...

Endless Wars in Cyberspace

I'm planning to do some research on threat intelligence in the next several months. For me, it's a brand new area, and hopefully, something interesting will be found this year. Nowadays cyberattacks are becoming more and more complex and subtle. An essential problem faced by defenders is that attackers know what the defenders know, while defenders don't know what the attackers are plotting. Cyberspace is different from the physical world in that there's no clear boundary between attackers and defenders. In the physical world, we can build forts with walls and trenches. We can see where attackers are, and what weapons they have. When we are defeated, at least we know who attacked us, and what their target is (nothing more than land, treasure, and slaves). In cyberspace, by contrast, a perfect attack targeting information exfiltration leaves no clue that it ever existed. The key difference lies in the fact that physical property is exclusive. It can only belong to one person at a specific ...

Research Plan for ETA

Today is a warm, sunny weekend. The smell of spring fills the air. After a year-long preparation of supporting techniques, I think it's time to draw some new pictures in the field of ETA research, which I started in December 2021. In 2022, I spent lots of time trying to answer the question: why does a model that works well on the training set degrade significantly in a new environment? Can we predict the degradation before the model's prediction? That's where incremental learning and data/concept drift come in. In the first several months I focused on incremental learning algorithms, mainly on the Hoeffding tree implementation. However, the deployment style of incremental learning models is quite different from that of traditional batch learning models, which makes their adoption in production more difficult. The good news is that this research led me to concept drift detection algorithms, and then to other forms of drift: covariate and label drifts, and finally domain ...
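To make "detecting drift" concrete, here is a minimal sketch of one classic detector, the Page-Hinkley test, applied to a stream of per-sample errors. It is not necessarily one of the algorithms studied in this research; the class name, the `delta` and `threshold` values, and the synthetic error stream are all assumptions chosen for illustration.

```python
class PageHinkley:
    """Flags an upward shift in the mean of a stream (e.g. a model's error)."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level for the test statistic
        self.n = 0                  # samples seen so far
        self.mean = 0.0             # running mean of the stream
        self.cum = 0.0              # cumulative deviation from the mean
        self.min_cum = 0.0          # smallest cumulative deviation seen

    def update(self, x):
        """Consume one value; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return (self.cum - self.min_cum) > self.threshold


# Synthetic stream: low error in the "training-like" environment,
# then a sudden jump after the environment changes at index 50.
stream = [0.1] * 50 + [0.9] * 50

detector = PageHinkley()
first_alarm = None
for i, err in enumerate(stream):
    if detector.update(err):
        first_alarm = i
        break

print(first_alarm)
```

With these settings the alarm is raised a few samples after the jump rather than immediately; lowering `threshold` shortens that delay at the cost of more false alarms.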

Deployment of Console Applications

The first version of Python I used was 2.5.1, when I had no idea what "open source" was. After almost 20 years, Python is still young and becoming more promising day by day. During my Ph.D. research at BIT, Python was a meshing tool for Abaqus, the CAE software used in our lab. Its power lay in automating the manual meshing workflow. After I got a job as a software developer, Python was used mostly as a powerful DevOps tool. When I began developing scripts for data science, it was the best choice with the help of NumPy and pandas. Nobody denies Python's power as a scripting language. But a new area where it has begun to shine is application development and deployment. With the help of PEP 621, you can write Python code in a concise way as a standalone application, put it under version control, publish it to code sharing platforms like GitHub or Bitbucket, and deploy it with `pipx`. There's no need to write setup.py anymore. Instead, you can use `poetry` or `pdm` to build a ...
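As a rough sketch of what such a standalone console application can look like, the module below exposes a `main` function that a `[project.scripts]` entry in `pyproject.toml` could point at. The package name `hello_cli`, the command name `hello`, and the greeting logic are all hypothetical, and the metadata fragment in the comment is illustrative rather than a complete project file.

```python
# hello_cli.py -- a hypothetical single-module console application.
#
# An illustrative PEP 621 fragment that would expose it as a command
# (a [build-system] table for your chosen backend is also required):
#
#   [project]
#   name = "hello-cli"
#   version = "0.1.0"
#
#   [project.scripts]
#   hello = "hello_cli:main"
import argparse


def format_greeting(name, shout=False):
    """Build the greeting text; kept separate so it is easy to test."""
    greeting = f"Hello, {name}!"
    return greeting.upper() if shout else greeting


def build_parser():
    """Define the command-line interface."""
    parser = argparse.ArgumentParser(
        prog="hello", description="Toy console application."
    )
    parser.add_argument("name", help="who to greet")
    parser.add_argument(
        "--shout", action="store_true", help="upper-case the greeting"
    )
    return parser


def main(argv=None):
    """Entry point: parse arguments, print the greeting, return exit code."""
    args = build_parser().parse_args(argv)
    print(format_greeting(args.name, shout=args.shout))
    return 0
```

Once such a project is published, running `pipx install` on it would put the `hello` command on the PATH inside its own isolated environment.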

The Joy of Reading in Natural Light

Nowadays I read paperbacks much less than ebooks. Not because I don't like reading paper books, but because after reading on screens for so many years, my eyes are very sensitive to the brightness of reading materials. It's very hard for me to read characters when the material is not bright enough. Unfortunately, the light in my living room is not bright enough for me. This morning it was cloudy, while in the afternoon it cleared up, and the blue sky and white clouds appeared. So I moved the desk near the window. To my surprise, the characters in paper books are much clearer and more comfortable to read than in artificial light. Maybe such a simple fact is not worth mentioning. But to me, a programmer who has spent too much time with the keyboard and screen, this brings back memories of childhood, reading a book or doing homework in the yard. The characters blurred at sunset. That's when we were going to have supper. The window of my living room is east-facing, so to...