- Listen to Morse code from a live wire
- Write down the series of short and long beeps on a piece of paper
- Hand the paper to someone who runs the code over to another tent, ultimately to be sent to the decryption device.
To start on all of this, I found a site with a bunch of example Morse code sound files, maintained by The National Association for Amateur Radio (there are a bunch of files here – http://www.arrl.org/code-practice-files).
To get these into R, I used the readWave function from Uwe Ligges’ very cool tuneR package. The only bother here was needing to convert the sound files from .mp3 to .wav first.. ultimately not that bad though.
Figuring out how to deal with the converted audio file was actually a lot of fun – after a few hours of tinkering, I arrived at a solution (copied below) which converts audio Morse code to text. It’s still pretty fragile because the example sound files I’ve been using are computer generated, so the function won’t really work on Morse code ‘in the wild’ yet.
If you’d like to give it a spin, try downloading one of the example morse code files e.g. http://www.arrl.org/files/file/Morse/Archive/10%20WPM/140625_10WPM.mp3
Next, convert the mp3 file to .wav (and make sure to change the file path for the sf.1 object to wherever the .wav file is stored on your machine), then execute my code. The solution starts with a bit of a strange stamp, but the rest should be pretty easy to read.
If you’ve got ideas for how to improve things, definitely let me know on twitter – I’m @M_T_Patterson.
Here’s the code:
#### initialize #### # clear workspace: rm(list = ls()) ## loading libraries: library(tuneR) library(RCurl) morseref.url <- getURL("https://raw.githubusercontent.com/MarkTPatterson/Blog/master/Morse/morseref.csv", ssl.verifypeer = FALSE) ref.df <- read.csv(text = morseref.url) # helper function: var_find = function(vec, t, s){ var.out = var(vec[(t-s):(t+s)]) return(var.out)} ## loading reference files ## note: you'll need to change the file path for the sf.1 file ## sound file can be downloaded here: ## http://www.arrl.org/files/file/Morse/Archive/10%20WPM/140625_10WPM.mp3 ## note: you'll need to convert the file to .wav for the function to work. sf.1 = readWave("C:/Users/Mark/Desktop/RInvest/Morse Code/Sound Files/140625_10WPM.wav") # defining the morse to text function: m.to.text.func = function(sound.file){ # read data into a dataframe df = data.frame(indx = 1:length(sound.file@left), vec = sound.file@left) # points to sample: sample.points = seq(from = 100, by = 100, to = length(df$vec)) # applying the variance finder at the sampled points: tiny.df = data.frame(var = sapply(sample.points, function(x){var_find(vec = df$vec, t = x, s = 50)})) # decide which points are 'on' tiny.df$on = as.numeric(tiny.df$var > 100000) tiny.df$indx = 1:nrow(tiny.df) # create a vector of changes in on: raw.vec = diff(tiny.df$on) # create indices for change instances -- these will be 1 and -1 beep.start.vals = which(raw.vec == 1) beep.stop.vals = which(raw.vec == -1) # converting indices to durations: beep.durs = beep.stop.vals - beep.start.vals pause.durs = beep.start.vals[-1] - beep.stop.vals[-length(beep.stop.vals)] ## note: for some files, there seems to be a few ## few beep durs that are only 1; for now, hard coding these out: beep.durs = beep.durs[beep.durs>1] pause.durs = pause.durs[pause.durs>1] ## recoding beep durs ## note: this step needs to take the beep.durs data and the pause.durs data ## and return duration barriers. ## first, creating pause barriers: raw.tab = table(pause.durs) pause.centers.raw = kmeans(as.numeric(names(raw.tab[raw.tab > 5])),3)$centers[,1] pause.centers = pause.centers.raw[order(pause.centers.raw,decreasing = F)] pause.levels = as.vector(pause.centers) # determining separator values: pause.sep.1 = mean(pause.levels[1:2]) pause.sep.2 = mean(pause.levels[2:3]) ## similar exercise for beep.durs: raw.tab = table(beep.durs) beep.centers.raw = kmeans(as.numeric(names(raw.tab[raw.tab > 5])),2)$centers[,1] beep.centers = beep.centers.raw[order(beep.centers.raw,decreasing = F)] beep.levels = as.vector(beep.centers) beep.sep = mean(beep.levels[1:2]) ## creating the letter and word end vectors: letter.ends = which(pause.durs > pause.sep.1) word.ends = which(as.numeric(pause.durs[pause.durs > pause.sep.1] > pause.sep.2) == 1) # recoding beep durations to long and short: beep.durs.let = beep.durs beep.durs.let[beep.durs.let > beep.sep] = "l" beep.durs.let[beep.durs.let < beep.sep] = "s" ## grouping the beep duration letters (l's and s's) into letters ## based on the letter ends vector empty.list = list() start.val = 1 for(i in 1:length(letter.ends)){ cur.points = beep.durs.let[start.val:letter.ends[i]] empty.list[[i]] = paste(cur.points,collapse = "") start.val = letter.ends[i] + 1 } letter.vec = unlist(lapply(empty.list, function(x){ref.df$letter[which(ref.df$code == x)]})) ## grouping letters into words based on word.ends vec: start.val = 1 empty.list = list() for(i in 1:length(word.ends)){ cur.points = letter.vec[start.val:word.ends[i]] empty.list[[i]] = paste(cur.points,collapse = "") start.val = word.ends[i] + 1 } ## saving as a new vector, with spacing: out = paste(unlist(empty.list),collapse = " ") return(out) } # examples: m.to.text.func(sf.1)
No comments:
Post a Comment