What is Real-ESRGAN?
Real-ESRGAN is a practical extension of the original ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks) model, developed by Xintao Wang and colleagues. It's trained on images that are degraded synthetically to mimic real-world damage (compression artifacts, blur, noise, low resolution), making it far more practical than earlier SR models that only handled simple bicubic downsampling.
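To make the difference from plain downsampling concrete, here is a minimal sketch of a degradation chain that combines blur, downsampling, noise, and a crude compression stand-in. It is a toy in pure Python, not Real-ESRGAN's training pipeline; the function names (`box_blur`, `downsample`, `add_noise`, `quantize`, `degrade`) and all parameter values are assumptions chosen for illustration.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def box_blur(img: Image) -> Image:
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def downsample(img: Image, factor: int) -> Image:
    """Keep every `factor`-th pixel in each direction."""
    return [row[::factor] for row in img[::factor]]


def add_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Add Gaussian noise and clamp the result back into [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]


def quantize(img: Image, levels: int) -> Image:
    """Crude stand-in for lossy compression: snap pixels to `levels` values."""
    return [[round(p * (levels - 1)) / (levels - 1) for p in row] for row in img]


def degrade(img: Image, factor: int = 2, sigma: float = 0.05,
            levels: int = 16, seed: int = 0) -> Image:
    """Blur, shrink, add noise, then quantize (illustrative order only)."""
    rng = random.Random(seed)
    small = downsample(box_blur(img), factor)
    return quantize(add_noise(small, sigma, rng), levels)


# Usage: an 8x8 diagonal gradient becomes a noisy, banded 4x4 image.
high_res = [[(x + y) / 14 for x in range(8)] for y in range(8)]
low_res = degrade(high_res, factor=2)
print(len(low_res), len(low_res[0]))
```

A model trained only to undo `downsample` never sees the blur, noise, or banding, which is roughly why bicubic-only SR models struggle on real photos.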
The generator is built from Residual-in-Residual Dense Blocks (RRDB) and is trained against a discriminator that learns to tell real high-resolution images from generated ones. The aim is output that looks convincingly sharp without over-sharpening or hallucinating texture that doesn't belong.
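The wiring of a Residual-in-Residual Dense Block can be sketched without any deep learning library. The toy below replaces convolutions with a hypothetical `make_toy_layer` stand-in and works on plain lists of floats, so it only shows the skip connections: dense connections inside each block, a scaled residual around each block, and a second scaled residual around the group. The 0.2 scale follows the public ESRGAN reference code; everything else (names, layer counts passed in, the toy layer itself) is an assumption for illustration.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]

RESIDUAL_SCALE = 0.2  # residual scaling factor used in the ESRGAN reference code


def make_toy_layer(weight: float, width: int) -> Layer:
    """Hypothetical stand-in for a conv layer: any-length input -> `width` values."""
    def layer(features: Vector) -> Vector:
        mean = sum(features) / len(features)
        return [weight * mean] * width
    return layer


def dense_block(x: Vector, layers: List[Layer]) -> Vector:
    """Each layer sees the block input plus every earlier layer's output."""
    collected = list(x)
    out = x
    for layer in layers:
        out = layer(collected)
        collected = collected + out  # mimics channel-wise concatenation
    # Inner residual: scaled output of the last layer added to the block input.
    return [xi + RESIDUAL_SCALE * oi for xi, oi in zip(x, out)]


def rrdb(x: Vector, blocks: List[List[Layer]]) -> Vector:
    """Chain several dense blocks, then add a scaled outer residual."""
    out = x
    for layers in blocks:
        out = dense_block(out, layers)
    return [xi + RESIDUAL_SCALE * oi for xi, oi in zip(x, out)]


# Usage: three dense blocks of five toy layers each.
x = [1.0, 2.0, 3.0, 4.0]
blocks = [[make_toy_layer(0.1, len(x)) for _ in range(5)] for _ in range(3)]
y = rrdb(x, blocks)
print(len(y))
```

Because every path back to the input is an addition, the block's output stays close to its input when the layers contribute little, which is one reason such stacks can be made very deep.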
Scale factors
<strong>8×</strong> — For extreme cases: postage-stamp sized images, thumbnails, or very degraded low-resolution sources. Only available with Real-ESRGAN.
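As a quick check on what a scale factor means in practice, the arithmetic can be sketched as follows. The helper names `upscaled_size` and `pixel_multiplier` are hypothetical and not part of any Real-ESRGAN API.

```python
from typing import Tuple


def upscaled_size(width: int, height: int, scale: int) -> Tuple[int, int]:
    """Output dimensions after upscaling both sides by `scale`."""
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    return width * scale, height * scale


def pixel_multiplier(scale: int) -> int:
    """How many output pixels are produced per input pixel."""
    return scale * scale


# Usage: a 128x96 thumbnail at 8x.
print(upscaled_size(128, 96, 8))   # (1024, 768)
print(pixel_multiplier(8))         # 64
```

At 8×, 63 of every 64 output pixels have to be inferred by the model, so this setting suits sources that are too small to be usable otherwise.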
Face enhancement
Real-ESRGAN's optional face enhance mode uses GFPGAN to restore facial detail, sharpen eyes, and fix degraded skin texture alongside the upscaling pass.
When to choose Real-ESRGAN:
Choose it when speed matters, when you need 8× scale, or when you're upscaling screenshots, illustrations, logos, anime, or text-heavy images where hallucinated detail would look wrong.