AgneseG/Audio-Segmentation

Relevant code & data have been stored with the following structure:

mp3 download folder

Code to retrieve files from remote websites
  • Radio commercial download.ipynb
  • 30sec Music Download.ipynb
  • Dutch Speech Downloads.ipynb
  • Podcast download.ipynb
  • Artists2.txt → list of artists from the Million Song Dataset

Data folder

Contains all mp3 files, plus the chunked streams used for the predictions
  • Commercials
  • Music
  • Speech
  • Streams
    • Stream 1 chunks → .npy files (each one corresponds to a chunk)
    • Stream 1 chunks - 3 sec → .npy files
    • Stream 2 chunks → .npy files
    • Stream 2 chunks - 3 sec → .npy files
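
Each chunk in the listing above is stored as its own .npy file. A minimal sketch of that idea, assuming the audio has already been decoded into a 1-D NumPy waveform (the notebooks may instead store spectrograms or other features). The function names, the sample rate and the file-name pattern are hypothetical and not taken from the repository:

```python
import numpy as np
from pathlib import Path


def split_into_chunks(signal, sample_rate, chunk_seconds):
    """Split a 1-D signal into consecutive, non-overlapping chunks of equal length.

    A trailing remainder shorter than one chunk is dropped.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    n_chunks = len(signal) // chunk_len
    return signal[: n_chunks * chunk_len].reshape(n_chunks, chunk_len)


def save_chunks(chunks, out_dir, prefix):
    """Write each chunk to its own .npy file and return the file names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = []
    for i, chunk in enumerate(chunks):
        name = f"{prefix}_{i:05d}.npy"  # hypothetical naming pattern
        np.save(out_dir / name, chunk)
        names.append(name)
    return names
```

Calling `split_into_chunks(waveform, sample_rate, 2)` or `split_into_chunks(waveform, sample_rate, 3)` would give the 2-second and 3-second variants respectively.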

Chunks 2sec folder

  • Chunking process
    • Chunking process - Commercial.ipynb
    • Chunking process - Music.ipynb
    • Chunking process - Speech.ipynb
  • Train-Test split
    • Train-Test split.ipynb → code used to split each category into train and test sets

    • Train → contains all .npy train files (each one corresponds to a chunk)

    • Test → contains all .npy test files

    • Speech over Music - Train → contains all .npy train files for the category Speech over Music

    • Speech over Music - Test → contains all .npy test files for the category Speech over Music

    • Train_chunks.npy → array with train files’ names

    • Train_labels.npy → array with the corresponding labels

    • Test_chunks.npy → array with test files’ names

    • Test_labels.npy → array with the corresponding labels

      Note: the last four .npy files were created as inputs for the Keras generator used afterwards

  • NN model
    • Speech - Music - Commercial
      • Model building
        • Sequential NN.ipynb → NN with dense layers only
        • CNN 2D.ipynb → Convolution 2D NN
        • CNN_2D.h5 → final Keras model
      • Predictions
        • Stream 1 chunking process.ipynb
        • Stream 1 predictions.ipynb
        • Stream 2 chunking process.ipynb
        • Stream 2 predictions.ipynb
        • Stream 1 predictions.pdf → plot of Stream 1 predictions from the 3-categories model (true labels are shown at the top of the plot)
        • Stream 2 predictions.pdf
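
The file-name and label arrays listed under 'Train-Test split' are meant to feed a Keras generator. The sketch below shows one way such arrays could drive batch loading, written as a plain Python generator. It assumes that labels are integer-coded from 0 to `n_classes - 1` and that every chunk file has the same shape. The function name and its parameters are hypothetical, and the notebooks may use a different mechanism (for example a `keras.utils.Sequence` subclass):

```python
import numpy as np
from pathlib import Path


def batch_generator(chunk_dir, file_names, labels, n_classes, batch_size):
    """Yield (X, y) batches forever, loading one .npy chunk per file name."""
    chunk_dir = Path(chunk_dir)
    n = len(file_names)
    while True:
        order = np.random.permutation(n)  # reshuffle on every pass over the data
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            X = np.stack([np.load(chunk_dir / file_names[i]) for i in idx])
            y = np.eye(n_classes, dtype=np.float32)[labels[idx]]  # one-hot labels
            yield X, y


# Hypothetical usage with the arrays named in the listing:
# names = np.load("Train_chunks.npy")
# labels = np.load("Train_labels.npy")
# gen = batch_generator("Train", names, labels, n_classes=3, batch_size=32)
```

Only the file names and labels are held in memory; the chunks themselves are read from disk one batch at a time.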

Chunks 3sec folder

Same structure as 'Chunks 2sec', with the 'NN model' folder further subdivided into
'Speech - Music - Commercial' and 'Speech - Music - Commercial - Speech over Music':
  • Chunking process
    • Chunking process - Commercial.ipynb
    • Chunking process - Music.ipynb
    • Chunking process - Speech.ipynb
    • Speech over music.ipynb → code to create speech over music chunks
  • Train-Test split
    • Train-Test split.ipynb → code used to split each category into train and test sets

    • Train → contains all .npy train files (each one corresponds to a chunk)

    • Test → contains all .npy test files (each one corresponds to a chunk)

    • Speech over Music - Train → contains all .npy train files for the category Speech over Music

    • Speech over Music - Test → contains all .npy test files for the category Speech over Music

    • Train_chunks.npy → array with train files’ names

    • Train_labels.npy → array with the corresponding labels

    • Test_chunks.npy → array with test files’ names

    • Test_labels.npy → array with the corresponding labels

    • Train_chunks_4categories.npy → same as above but for the 4 categories

    • Train_labels_4categories.npy

    • Test_chunks_4categories.npy

    • Test_labels_4categories.npy

      Note: the last eight .npy files were created as inputs for the Keras generator used afterwards

  • NN model
    • Speech - Music - Commercial
      • Model building
        • CNN 2D.ipynb → Convolution 2D NN
        • CNN_2D_3sec.h5 → final Keras model
      • Predictions
        • Stream 1 chunking process + predictions.ipynb
        • Stream 2 chunking process + predictions.ipynb
        • Stream1_3sec.pdf → plot of Stream 1 predictions from the 3-categories model (true labels are shown at the top of the plot)
    • Speech - Music - Commercial - Speech over Music → same structure as ‘Speech - Music - Commercial’.
Each notebook contains further explanations of the steps performed.
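
The 'Speech over Music' category is built from chunks that combine speech and music. As an illustration only, one common way to create such a chunk is to add an attenuated music waveform to a speech waveform. The sketch assumes both inputs are 1-D float arrays of equal length, roughly in the range [-1, 1]. The function name and the `music_gain` value are hypothetical and may differ from what 'Speech over music.ipynb' does:

```python
import numpy as np


def mix_speech_over_music(speech, music, music_gain=0.3):
    """Overlay speech on attenuated music; inputs are equal-length 1-D float arrays."""
    if speech.shape != music.shape:
        raise ValueError("speech and music chunks must have the same shape")
    mixed = speech + music_gain * music
    peak = np.max(np.abs(mixed))
    if peak > 1.0:
        mixed = mixed / peak  # rescale only when needed, to avoid clipping
    return mixed
```

Lowering `music_gain` keeps the speech dominant, and raising it moves the mix closer to plain music.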
