When you implement new software in your company, you often write user manuals that explain how to use it. Have you ever found it tedious to take screenshots of each step manually and add them to your manuals? In this article, you will lighten this heavy burden by delegating most of the work to AI. Before proceeding, you will need a Google account, and I assume you have already created a folder on your computer to store a video.
Step-by-Step Guide
Preparation 1: Install FFmpeg on Your Computer
First, you need to install FFmpeg. The steps depend on your operating system (OS); the steps below are for Windows.
- Press the Windows key.
- In the search bar, enter "powershell."
- Right-click the Windows PowerShell icon.
- Choose "Run as administrator."
- You will be asked for permission to let the app make changes to your computer. Click "Yes."
- Enter this command.
winget install --id=Gyan.FFmpeg -e
- Press Enter.
- FFmpeg will be installed.
- Press Windows + I to open Settings.
- In the sidebar, click "System."
- Scroll down the settings until you see the section "About," and click it.
- Scroll down to the section "Device specifications." Below it, you will see a set of links labeled "Related links." Click "Advanced system settings."
- The System Properties pop-up will open. Below the section "Startup and Recovery," click "Environment Variables."
- Under "System variables," you will see the variable "Path" in the table. Select it and click "Edit."
- You will see the list of paths stored in the variable. Click an empty line.
- Enter this path in the text area. Replace ${user} with your username on your computer.
C:\Users\${user}\AppData\Local\Microsoft\WinGet\Packages\Gyan.FFmpeg_Microsoft.Winget.Source_8wekyb3d8bbwe
- Click "OK" beside the "Cancel" button.
- Click "×" in the top right corner of the Environment Variables popup.
- Click "×" in the top right corner of the System Properties popup.
- Go back to PowerShell.
- Enter this command.
ffmpeg -version
- Press Enter.
- If this works correctly, you will see a response such as this. If not, check that the ${user} part you entered in the environment variable is the same as your username.
ffmpeg version 8.0-full_build-www.gyan.dev Copyright (c) 2000-2025 the FFmpeg developers
built with gcc 15.2.0 (Rev8, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-lcms2 --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-libdvdnav --enable-libdvdread --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libopenjpeg --enable-libquirc --enable-libuavs3d --enable-libxevd --enable-libzvbi --enable-liboapv --enable-libqrencode --enable-librav1e --enable-libsvtav1 --enable-libvvenc --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxeve --enable-libxvid --enable-libaom --enable-libjxl --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-openal --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-liblc3 --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint --enable-whisper
libavutil 60. 8.100 / 60. 8.100
libavcodec 62. 11.100 / 62. 11.100
libavformat 62. 3.100 / 62. 3.100
libavdevice 62. 1.100 / 62. 1.100
libavfilter 11. 4.100 / 11. 4.100
libswscale 9. 1.100 / 9. 1.100
libswresample 6. 1.100 / 6. 1.100
Exiting with exit code 0
For Mac or Linux, please read the installation guide for your OS.
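If you prefer to confirm the installation from a script instead of reading the console output, the check above can be sketched in Python. This is only a minimal sketch: the function names are hypothetical, and it assumes the first line of the output starts with "ffmpeg version", as in the sample response shown for Windows.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_ffmpeg_version(output: str) -> Optional[str]:
    """Return the version token from the first line of `ffmpeg -version` output."""
    lines = output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"ffmpeg version (\S+)", first_line)
    return match.group(1) if match else None


def installed_ffmpeg_version() -> Optional[str]:
    """Return the installed FFmpeg version, or None if ffmpeg is not on the PATH."""
    executable = shutil.which("ffmpeg")  # searches the folders listed in PATH
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "-version"], capture_output=True, text=True, check=False
    )
    return parse_ffmpeg_version(result.stdout)


if __name__ == "__main__":
    version = installed_ffmpeg_version()
    if version is None:
        print("FFmpeg was not found. Check the Path entry and open a new console.")
    else:
        print(f"FFmpeg {version} is ready.")
```

If the script reports that FFmpeg was not found, the most likely cause is the same as in the manual check: the path entry does not match your username.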
Step 1: Record the Steps for Your Manual in a Video
After you have installed FFmpeg, record a video of the steps you want to explain in the manual.
- Record the video using the default screen recording tool on your computer. Choose your OS, and follow the steps.
- Set up the screen you want to record.
- Press Windows + Shift + S.
- Click the video button beside the button with a camera.
- Choose the screen area you want to include.
- Click "Start."
- After 3 seconds, recording will start.
- After recording, click the button with a red square.
For the guide, please read "Using Built-in Screen Capture Tool (Command + Shift + 5) — Easiest Way" in this article.
For the guide, please read this article.
- Move the video file to the folder where you want to store it. If you don't know how to do this, here is how for each OS.
- Open File Explorer (where you store files).
- Click the folder "Videos" in the sidebar.
- Click the folder "Screen Recordings."
- Select the video you want to move.
- Press Ctrl + X to cut the file.
- Go to the folder where you want to store the video.
- Press Ctrl + V to paste it in the folder.
For the guide, please read this article.
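The cut-and-paste steps above can also be done with a short script, which may help if you record videos often. This is a sketch under assumptions: the function name and the example paths are hypothetical, and it refuses to overwrite a file that already exists in the target folder.

```python
import shutil
from pathlib import Path


def move_video(video_path: str, target_folder: str) -> Path:
    """Move a recorded video into the target folder and return its new path."""
    source = Path(video_path)
    folder = Path(target_folder)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    destination = folder / source.name
    if destination.exists():
        raise FileExistsError(f"{destination} already exists")
    shutil.move(str(source), str(destination))
    return destination
```

For example, `move_video("recording.mp4", "manual_video")` would move a file named recording.mp4 into a folder named manual_video; both names are placeholders for your own.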
Step 2: Opt Out Your Chats from Machine Learning
Since your video may contain sensitive information, opt your chats out of Gemini's machine learning.
- Visit https://gemini.google.com.
- In the sidebar, click "Settings & help."
- Click "Activity."
- Click "On."
- Click "Turn off."
- Read the text if you want, and click "Got it."
- Click "Got it."
- Your chats are opted out.
Step 3: Extract Frames from the Video
After you have recorded the video, extract the frames for each step using FFmpeg.
- Visit https://gemini.google.com.
- Enter this prompt template in the textbox in the center of the page.
[Task]: Generate FFmpeg commands that extract key frames from this video, which match before and after each process in the list below.
[Operating System]: (Specify your OS)
[List]: (Describe the steps in numbered lists)
[Constraints]:
- Output must only be a copyable code snippet.
- The frames' file type must be PNG format.
- Concatenate all commands with `;` if [Operating System] is "Windows" or "Linux" or `&&` if [Operating System] is "Mac."
```text
(an FFmpeg command here) ; (another FFmpeg command here)
```
- When specifying a video file in an FFmpeg command, specify it in every command.
- Fill in the placeholders enclosed in "(" and ")."
- Click "+" beside the button "Tools."
- Click "Upload files."
- Upload the video.
- Click "2.5 Flash."
- Choose "2.5 Pro."
- Press Enter.
- FFmpeg commands will be generated.
- Click the copy button with 2 layered squares.
- Go back to the PowerShell screen if you use Windows, the Terminal screen if you use Mac, or the Bash screen if you use Linux.
- Enter this command. Replace (path) with the path of the folder where you stored the video.
cd (path)
- Press Enter.
- Paste the copied command in the console (text-enterable area).
- Press Enter.
- Frames will be extracted from the video.
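To give a sense of what the generated commands might look like, here is a minimal Python sketch that builds the same kind of one-line command string. The output Gemini actually produces will differ: the timestamps, the `frame_01.png` naming, and the function name are assumptions for illustration. The sketch uses only the standard FFmpeg options `-ss` (seek to a timestamp), `-i` (input file), and `-frames:v 1` (write a single video frame), and it follows the joining rule from the prompt template.

```python
from typing import List


def build_frame_commands(video: str, timestamps: List[str], operating_system: str) -> str:
    """Build one copyable line of FFmpeg commands, one PNG frame per timestamp."""
    commands = []
    for index, timestamp in enumerate(timestamps, start=1):
        # The video file is named in every command, as the prompt requires.
        commands.append(
            f'ffmpeg -ss {timestamp} -i "{video}" -frames:v 1 frame_{index:02d}.png'
        )
    # Same joining rule as the prompt: `&&` for Mac, `;` for Windows and Linux.
    separator = " && " if operating_system == "Mac" else " ; "
    return separator.join(commands)


if __name__ == "__main__":
    # Hypothetical file name and timestamps, for illustration only.
    print(build_frame_commands("recording.mp4", ["00:00:03", "00:00:08"], "Windows"))
```

Comparing Gemini's output with a pattern like this is a quick way to spot commands that omit the video file or write a format other than PNG.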
Step 4: Generate a Manual
- Visit https://gemini.google.com.
- Click "+."
- Click "Upload files."
- Upload the first 10 extracted frames.
- Click "Tools."
- Click "Canvas."
- Enter this prompt template in the textbox.
[Task]: Create a step-by-step HTML user manual.
[Purpose]: (Explain the purpose)
[Process To Explain]: (Copy and paste the same numbered lists as you wrote in the last prompt)
[Target]: (Specify the target audience)
[Constraints]:
- Use clear yet simple language.
- Avoid jargon unless necessary.
- Use provided images to clearly illustrate processes.
- When adding the uploaded images in the manual, every image file path must be only the file name plus `.png`, even when the file formats of these images are JPG.
- If one numbered list contains more than one step, break them into single processes.
- Use consistent, user-friendly design.
- Fill in the placeholders enclosed in "(" and ")."
- Press Enter.
- A manual will be generated. Please note that the frames may not be displayed correctly in the preview, because the file paths in the manual are written to match the frames' paths on your computer, but this is not a problem.
- In the same textbox, enter this prompt.
Here are the rest of the images I forgot to upload. Using these images as well as the existing ones, recreate the manual.
- Click "+."
- Click "Upload files."
- Upload the rest of the frames.
- Press Enter.
- Repeat the 11th to 15th steps until you have uploaded all the frames to the chat.
- Click "Code" beside the "Preview" button.
- Select the entire code by pressing Ctrl + A if you use Windows or Linux, or ⌘ + A if you use Mac. Then copy it by pressing Ctrl + C, or ⌘ + C on Mac.
- Visit this HTML code downloader.
⚠️ Please note that this app was created by me, and I don't collect any personal information from you.
- Paste the copied code in the textbox.
- Click "Download HTML File."
- The HTML file will be downloaded.
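Because the frames are uploaded in groups of 10 in the steps above, it can help to list the groups before you start. The following is a minimal sketch, assuming the extracted frames are PNG files in one folder and that sorting them by name gives the right order; the function names are hypothetical.

```python
from pathlib import Path
from typing import List


def chunk_names(names: List[str], batch_size: int = 10) -> List[List[str]]:
    """Split file names into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [names[i:i + batch_size] for i in range(0, len(names), batch_size)]


def frame_batches(folder: str, batch_size: int = 10) -> List[List[str]]:
    """List the PNG frames in a folder, sorted by name, as upload batches."""
    names = sorted(path.name for path in Path(folder).glob("*.png"))
    return chunk_names(names, batch_size)


if __name__ == "__main__":
    # Prints one line per upload, using the current folder as an example.
    for number, batch in enumerate(frame_batches("."), start=1):
        print(f"Upload {number}: {', '.join(batch)}")
```

The first batch corresponds to the first upload, and each later batch corresponds to one round of the follow-up prompt.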
Step 5: Check and Edit the Manual
Once your manual is created, check whether any steps are missing and whether the screenshots are consistent with the text. If needed, ask Gemini to correct them.
Here's an example of a manual.
Benefits of Using AI-Generated Manuals
It's Fast.
Creating a manual with screenshots often takes hours. But if you hand this heavy lifting over to Gemini, you can turn your idea into a manual faster than writing it entirely by hand, and you will only need to check and edit the result. Thus, you can focus on more valuable tasks, such as developing business improvement apps.
It's User-friendly.
First of all, your manual is written for your audience, not for you. One Redditor complained about having a difficult time understanding the technical language in documentation. Technical terms are sometimes useful because they are shorter than a long explanation, but don't worry: Gemini can rephrase your words in simpler terms. On top of that, it can break what you think of as one process down into single, actionable steps. As a result, you can reduce the questions that come from confusion about your manual.
It's Updated Faster.
Another Redditor's proof of concept (PoC), which verifies the feasibility of new ideas and concepts, failed. One reason was that the documentation of the software "Foreman" didn't seem to keep up with the releases, leaving old screenshots in place. I felt that the product team could have kept their documentation up to date with this AI hack. By adding new screenshots, they could also have simplified their software. To keep this from happening in your company, Gemini helps you streamline your manual generation task.
Troubleshooting
Screenshots aren't displayed correctly.
If your screenshots aren't displayed correctly, ask Gemini with this prompt.
It seems some images aren't displayed correctly. Make sure the file names are the same as the uploaded images, and the file formats are `.png`.
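Before asking Gemini, you can find out which images are the problem with a small script. The following is a sketch, not part of the original workflow: it assumes the manual refers to images by file name only, as the prompt template requests, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Iterable, List


class ImageSourceCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in the manual."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def find_broken_images(html_text: str, frame_names: Iterable[str]) -> List[str]:
    """Return image paths that are not PNG or do not match an extracted frame."""
    available = set(frame_names)
    collector = ImageSourceCollector()
    collector.feed(html_text)
    return [
        source
        for source in collector.sources
        if not source.lower().endswith(".png") or source not in available
    ]
```

You could pass the downloaded HTML file's text and the names of the files in your frames folder, then include the returned file names in the prompt above so that Gemini knows exactly which paths to fix.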
In summary, this article tackled the problem with traditional manual creation by delegating the work to 2 Gemini AI models: Gemini 2.5 Flash and 2.5 Pro. If this information was helpful, I would be glad if you shared your thoughts on social media. You can share this article by clicking the button with 3 connected dots below.
- https://www.reddit.com/r/linuxadmin/comments/e5cqx8/anyone_using_foreman/
- https://reddit.com/r/learnprogramming/comments/1b864zu/am_i_dumb_why_is_documentation_so_hard_for_me_to
- https://ai-economy-analysis.hatenablog.com/entry/2025/09/28/130851
- https://www.techsmith.com/blog/user-documentation/
- https://www.ntt.com/bizon/glossary/e-p/poc.html