Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Rework image generation to improve performance #8821

Merged
merged 14 commits into from
Oct 25, 2023
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions .changeset/good-mirrors-bake.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
'astro': minor
---

Improved performance of optimized images generation at build time. Astro will now generate optimized images concurrently, which can significantly speed up build times for sites with many images. Additionally, Astro will now reuse the same buffer for all variants of an image, which should improve performance for sites websites with many variants of the same image, especially when using remote images.
Princesseuh marked this conversation as resolved.
Show resolved Hide resolved

No code changes are required to take advantage of these improvements.
7 changes: 6 additions & 1 deletion .eslintrc.cjs
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,12 @@ module.exports = {
'@typescript-eslint/array-type': ['error', { default: 'array-simple' }],
'@typescript-eslint/no-unused-vars': [
'error',
{ argsIgnorePattern: '^_', ignoreRestSiblings: true },
{
argsIgnorePattern: '^_',
varsIgnorePattern: '^_',
caughtErrorsIgnorePattern: '^_',
ignoreRestSiblings: true,
Comment on lines +21 to +24
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unrelated to the PR, but it was driving me nuts

},
],
'no-only-tests/no-only-tests': 'error',
'@typescript-eslint/no-shadow': ['error'],
Expand Down
1 change: 1 addition & 0 deletions packages/astro/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -156,6 +156,7 @@
"mime": "^3.0.0",
"ora": "^7.0.1",
"p-limit": "^4.0.0",
"p-queue": "^7.4.1",
"path-to-regexp": "^6.2.1",
"preferred-pm": "^3.1.2",
"probe-image-size": "^7.2.3",
Expand Down
272 changes: 180 additions & 92 deletions packages/astro/src/assets/build/generate.ts
Original file line number Diff line number Diff line change
@@ -1,12 +1,18 @@
import { dim, green } from 'kleur/colors';
import fs, { readFileSync } from 'node:fs';
import { basename, join } from 'node:path/posix';
import PQueue from 'p-queue';
import type { AstroConfig } from '../../@types/astro.js';
import type { BuildPipeline } from '../../core/build/buildPipeline.js';
import { getOutDirWithinCwd } from '../../core/build/common.js';
import { prependForwardSlash } from '../../core/path.js';
import { getTimeStat } from '../../core/build/util.js';
import type { Logger } from '../../core/logger/core.js';
import { isRemotePath, prependForwardSlash } from '../../core/path.js';
import { isServerLikeOutput } from '../../prerender/utils.js';
import type { MapValue } from '../../type-utils.js';
import { getConfiguredImageService, isESMImportedImage } from '../internal.js';
import type { LocalImageService } from '../services/service.js';
import type { ImageMetadata, ImageTransform } from '../types.js';
import type { AssetsGlobalStaticImagesList, ImageMetadata, ImageTransform } from '../types.js';
import { loadRemoteImage, type RemoteCacheEntry } from './remote.js';

interface GenerationDataUncached {
Expand All @@ -23,15 +29,28 @@ interface GenerationDataCached {

type GenerationData = GenerationDataUncached | GenerationDataCached;

export async function generateImage(
type AssetEnv = {
logger: Logger;
count: { total: number; current: number };
useCache: boolean;
assetsCacheDir: URL;
serverRoot: URL;
clientRoot: URL;
imageConfig: AstroConfig['image'];
assetsFolder: AstroConfig['build']['assets'];
};

type ImageData = { data: Buffer; expires: number };

export async function prepareAssetsGenerationEnv(
pipeline: BuildPipeline,
options: ImageTransform,
filepath: string
): Promise<GenerationData | undefined> {
totalCount: number
): Promise<AssetEnv> {
const config = pipeline.getConfig();
const logger = pipeline.getLogger();
let useCache = true;
const assetsCacheDir = new URL('assets/', config.cacheDir);
const count = { total: totalCount, current: 1 };

// Ensure that the cache directory exists
try {
Expand All @@ -53,113 +72,182 @@ export async function generateImage(
clientRoot = config.outDir;
}

const isLocalImage = isESMImportedImage(options.src);

const finalFileURL = new URL('.' + filepath, clientRoot);
const finalFolderURL = new URL('./', finalFileURL);
return {
logger,
count,
useCache,
assetsCacheDir,
serverRoot,
clientRoot,
imageConfig: config.image,
assetsFolder: config.build.assets,
};
}

// For remote images, instead of saving the image directly, we save a JSON file with the image data and expiration date from the server
const cacheFile = basename(filepath) + (isLocalImage ? '' : '.json');
const cachedFileURL = new URL(cacheFile, assetsCacheDir);
export async function generateImagesForPath(
originalFilePath: string,
transforms: MapValue<AssetsGlobalStaticImagesList>,
env: AssetEnv,
queue: PQueue
) {
const originalImageData = await loadImage(originalFilePath, env);
Copy link
Member Author

@Princesseuh Princesseuh Oct 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, the fact that we load the image at the beginning even when all the transforms might be cached already is slower, especially for fully cached builds where there's a lot of different images with a small amount of variants. However, in practice I found that this didn't change much, probably that the rest of the improvements counteract this, or maybe hard drives are just fast enough now.

If we want to get it of this, what we could probably do is store a list of cached hashes and check first if all the transforms requested are already cached before loading the image, shouldn't be too hard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should apply your suggestions, probably in a follow-up PR. An I/O operation is usually more expensive than a memory operation, and even though Node.js is really great in I/O operations, I think we should try to avoid them if we can.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree, but I think it's not as bad as it might seems because the file is already written in the same process, so it's presumably already in some sort of hot cache.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we also make originalImageData a function to lazily load it, and those who call in parallel would share the same cache? Vite has this utility (which could be generalized) to simplify a bit.

But also I think we can do it in a follow-up too if there's not a big perf impact currently.


for (const [_, transform] of transforms) {
queue.add(async () =>
generateImage(originalImageData, transform.finalPath, transform.transform)
);
}

await fs.promises.mkdir(finalFolderURL, { recursive: true });
async function generateImage(
originalImage: ImageData,
filepath: string,
options: ImageTransform
) {
const timeStart = performance.now();
const generationData = await generateImageInternal(originalImage, filepath, options);

const timeEnd = performance.now();
const timeChange = getTimeStat(timeStart, timeEnd);
const timeIncrease = `(+${timeChange})`;
const statsText = generationData.cached
? `(reused cache entry)`
: `(before: ${generationData.weight.before}kB, after: ${generationData.weight.after}kB)`;
const count = `(${env.count.current}/${env.count.total})`;
env.logger.info(
null,
Princesseuh marked this conversation as resolved.
Show resolved Hide resolved
` ${green('▶')} ${filepath} ${dim(statsText)} ${dim(timeIncrease)} ${dim(count)}`
);
env.count.current++;
}

// Check if we have a cached entry first
try {
if (isLocalImage) {
await fs.promises.copyFile(cachedFileURL, finalFileURL);
async function generateImageInternal(
originalImage: ImageData,
filepath: string,
options: ImageTransform
): Promise<GenerationData> {
const isLocalImage = isESMImportedImage(options.src);
const finalFileURL = new URL('.' + filepath, env.clientRoot);

return {
cached: true,
};
} else {
const JSONData = JSON.parse(readFileSync(cachedFileURL, 'utf-8')) as RemoteCacheEntry;
// For remote images, instead of saving the image directly, we save a JSON file with the image data and expiration date from the server
const cacheFile = basename(filepath) + (isLocalImage ? '' : '.json');
const cachedFileURL = new URL(cacheFile, env.assetsCacheDir);

// If the cache entry is not expired, use it
if (JSONData.expires > Date.now()) {
await fs.promises.writeFile(finalFileURL, Buffer.from(JSONData.data, 'base64'));
// Check if we have a cached entry first
try {
if (isLocalImage) {
await fs.promises.copyFile(cachedFileURL, finalFileURL);

return {
cached: true,
};
} else {
const JSONData = JSON.parse(readFileSync(cachedFileURL, 'utf-8')) as RemoteCacheEntry;
Princesseuh marked this conversation as resolved.
Show resolved Hide resolved

if (!JSONData.data || !JSONData.expires) {
await fs.promises.unlink(cachedFileURL);

throw new Error(
`Malformed cache entry for ${filepath}, cache will be regenerated for this file.`
);
}

// If the cache entry is not expired, use it
if (JSONData.expires > Date.now()) {
await fs.promises.writeFile(finalFileURL, Buffer.from(JSONData.data, 'base64'));

return {
cached: true,
};
} else {
await fs.promises.unlink(cachedFileURL);
}
}
} catch (e: any) {
if (e.code !== 'ENOENT') {
throw new Error(`An error was encountered while reading the cache file. Error: ${e}`);
}
// If the cache file doesn't exist, just move on, and we'll generate it
}
} catch (e: any) {
if (e.code !== 'ENOENT') {
throw new Error(`An error was encountered while reading the cache file. Error: ${e}`);
}
// If the cache file doesn't exist, just move on, and we'll generate it
}

// The original filepath or URL from the image transform
const originalImagePath = isLocalImage
? (options.src as ImageMetadata).src
: (options.src as string);

let imageData;
let resultData: { data: Buffer | undefined; expires: number | undefined } = {
data: undefined,
expires: undefined,
};

// If the image is local, we can just read it directly, otherwise we need to download it
if (isLocalImage) {
imageData = await fs.promises.readFile(
new URL(
'.' + prependForwardSlash(join(config.build.assets, basename(originalImagePath))),
serverRoot
const finalFolderURL = new URL('./', finalFileURL);
await fs.promises.mkdir(finalFolderURL, { recursive: true });

Princesseuh marked this conversation as resolved.
Show resolved Hide resolved
// The original filepath or URL from the image transform
const originalImagePath = isLocalImage
? (options.src as ImageMetadata).src
: (options.src as string);

let resultData: Partial<ImageData> = {
data: undefined,
expires: originalImage.expires,
};

const imageService = (await getConfiguredImageService()) as LocalImageService;
resultData.data = (
await imageService.transform(
originalImage.data,
{ ...options, src: originalImagePath },
env.imageConfig
)
);
} else {
const remoteImage = await loadRemoteImage(originalImagePath);
resultData.expires = remoteImage.expires;
imageData = remoteImage.data;
}

const imageService = (await getConfiguredImageService()) as LocalImageService;
resultData.data = (
await imageService.transform(imageData, { ...options, src: originalImagePath }, config.image)
).data;

try {
// Write the cache entry
if (useCache) {
if (isLocalImage) {
await fs.promises.writeFile(cachedFileURL, resultData.data);
} else {
await fs.promises.writeFile(
cachedFileURL,
JSON.stringify({
data: Buffer.from(resultData.data).toString('base64'),
expires: resultData.expires,
})
);
).data;

try {
// Write the cache entry
if (env.useCache) {
if (isLocalImage) {
await fs.promises.writeFile(cachedFileURL, resultData.data);
} else {
await fs.promises.writeFile(
cachedFileURL,
JSON.stringify({
data: Buffer.from(resultData.data).toString('base64'),
expires: resultData.expires,
})
);
}
}
} catch (e) {
env.logger.warn(
'astro:assets',
`An error was encountered while creating the cache directory. Proceeding without caching. Error: ${e}`
Princesseuh marked this conversation as resolved.
Show resolved Hide resolved
);
} finally {
// Write the final file
await fs.promises.writeFile(finalFileURL, resultData.data);
}
} catch (e) {
logger.warn(
'astro:assets',
`An error was encountered while creating the cache directory. Proceeding without caching. Error: ${e}`
);
} finally {
// Write the final file
await fs.promises.writeFile(finalFileURL, resultData.data);
}

return {
cached: false,
weight: {
// Divide by 1024 to get size in kilobytes
before: Math.trunc(imageData.byteLength / 1024),
after: Math.trunc(Buffer.from(resultData.data).byteLength / 1024),
},
};
return {
cached: false,
weight: {
// Divide by 1024 to get size in kilobytes
before: Math.trunc(originalImage.data.byteLength / 1024),
after: Math.trunc(Buffer.from(resultData.data).byteLength / 1024),
},
};
}
}

export function getStaticImageList(): Map<string, { path: string; options: ImageTransform }> {
export function getStaticImageList(): AssetsGlobalStaticImagesList {
if (!globalThis?.astroAsset?.staticImages) {
return new Map();
}

return globalThis.astroAsset.staticImages;
}

async function loadImage(path: string, env: AssetEnv): Promise<ImageData> {
if (isRemotePath(path)) {
const remoteImage = await loadRemoteImage(path);
return {
data: remoteImage.data,
expires: remoteImage.expires,
};
}

return {
data: await fs.promises.readFile(
new URL('.' + prependForwardSlash(join(env.assetsFolder, basename(path))), env.serverRoot)
),
expires: 0,
};
}
7 changes: 6 additions & 1 deletion packages/astro/src/assets/types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -8,12 +8,17 @@ export type ImageQuality = ImageQualityPreset | number;
export type ImageInputFormat = (typeof VALID_INPUT_FORMATS)[number];
export type ImageOutputFormat = (typeof VALID_OUTPUT_FORMATS)[number] | (string & {});

export type AssetsGlobalStaticImagesList = Map<
string,
Map<string, { finalPath: string; transform: ImageTransform }>
>;

declare global {
// eslint-disable-next-line no-var
var astroAsset: {
imageService?: ImageService;
addStaticImage?: ((options: ImageTransform) => string) | undefined;
staticImages?: Map<string, { path: string; options: ImageTransform }>;
staticImages?: AssetsGlobalStaticImagesList;
};
}

Expand Down
Loading