
Welcome to the FakeSound demo website.

You can listen to some deepfake general audio samples here.

Certain semantic segments of these clips have been located by an audio-text grounding model and regenerated using generative models.
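As a rough illustration of the "Grounded & Masked" stage, the sketch below silences one time span of a clip. It assumes the grounding model has already returned an onset and offset in seconds, and that masking means zeroing the samples. Both are assumptions made for illustration, and `mask_segment` is a hypothetical helper, not part of FakeSound.

```python
def mask_segment(samples, sample_rate, onset_s, offset_s):
    """Return a copy of `samples` with [onset_s, offset_s) set to silence.

    Hypothetical helper: zero-filling is an assumed masking strategy,
    not necessarily what the FakeSound pipeline does.
    """
    if not 0 <= onset_s <= offset_s:
        raise ValueError("expected 0 <= onset_s <= offset_s")
    # Convert seconds to sample indices and clamp to the clip length.
    start = min(int(round(onset_s * sample_rate)), len(samples))
    end = min(int(round(offset_s * sample_rate)), len(samples))
    masked = list(samples)  # copy, so the original clip is untouched
    masked[start:end] = [0.0] * (end - start)
    return masked


# Toy clip: 1 second of constant signal at 8 samples per second.
clip = [1.0] * 8
masked = mask_segment(clip, sample_rate=8, onset_s=0.25, offset_s=0.75)
```

In a real pipeline the masked span would then be handed to a generative model for regeneration; that step is not sketched here.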

Test-easy: The reconstructed audio is generated by the high-performing AudioLDM2, with the reconstruction range limited to 1-4 seconds.

Samples (6), each shown as: Ground truth | Grounded & Masked | Regenerated


Test-hard: The reconstructed audio is generated by the high-performing AudioLDM2, with no limit on the reconstruction range.

Samples (4), each shown as: Ground truth | Grounded & Masked | Regenerated


Test-zeroshot: The reconstructed audio is generated by AudioLDM1, a model unseen during training, with no limit on the reconstruction range.

Samples (4), each shown as: Ground truth | Grounded & Masked | Regenerated