[SwiftUI] Vocamera
1. Requirements
Project Objectives
While studying English, I found it tedious to write down every English word I didn’t know in a book. I thought it would be useful to have an application that turns a page of a book into an English vocabulary list. I searched the App Store but found no such application, so I made it myself!
Specification
version 1.0.0
- Vocabulary
  - Create
  - Edit
  - Delete
- Vocabulary Note
  - Create
  - Edit
  - Delete
- Get Image from
  - Library
  - Camera
- Convert Image to Words
  - Extract only the vocabulary words from all recognized words
2. Structure/Function
version 1.0.0
Roughly, Vocamera version 1.0.0 consists of the following:
- 2 Models: Voca, VocaNote Entities - Core Data Classes & Properties
- 4 Views: Voca, VocaNote, AddVocaNote, ImageRecognize
- Image Process structs: Library, Scanner
- TextRecognizer Class
- TextProcess Class
- Others: Utilities, Persistence…
- (1) Models
  - Each entity has a Class and Properties.
- (2) Views
  - ImageRecognizeView can access the photo library or the camera.
- (3) Image Process Structs
  - Each struct has its own Coordinate class and loads the photo library or the scanner.
- (4) TextRecognizer Class, TextProcess Class
  - The TextRecognizer class takes UIImages or CGImages and transforms them into text.
  - The TextProcess class removes whitespace and unnecessary words.
- (5) Others
  - UtilityViews, UtilityExtensions, Persistence…
3. Implementation: Focus on the Problem-Solving
: Among the implementation details, the parts that took a long time or still have room for improvement are summarized along with their source code.
version 1.0.0
In version 1.0.0, the following were the main features that were difficult (or new to me) to implement.
➀ Use Core Data
➁ Use various views (NavigationView, fullScreenCover, sheet…) and a custom EditMode
➂ Take pictures or import existing images and convert them into text.
➃ Trim the text extracted from the images.
➄ Determine the valid words in the trimmed text.
The code below shows the main functions needed to implement these features.
Main problems
- Transforming images into text
- Handling the extracted text
Functions
➀ View: Switch Screens, Custom EditMode
: NavigationView, fullScreenCover, sheet…
- Custom EditMode
@Environment(\.editMode) private var editMode
@State private var mode: EditMode = .inactive
...
...
...
.onChange(of: mode, perform: { newMode in
    if newMode == .active {
        editNote = EditNote(vocaNote: parentVocaNote, viewContext: viewContext)
    } else if newMode == .inactive {
        if parentVocaNote.title.count < 1 {
            parentVocaNote.title = editNote.beforeNoteTitle
            showNoteTitleAlert = true
            self.mode = .active
        }
    }
})
.navigationBarItems(leading: mode == .active ? cancelButton : nil)
.environment(\.editMode, $mode)
...
...
...
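The rule in the `onChange` handler above can be read as a small state machine: entering edit mode appears to snapshot the note, and leaving it with an empty title restores the old title, raises an alert, and keeps edit mode on. The following is a minimal sketch of that rule in plain Swift, without SwiftUI. `Mode` and `NoteEditor` are hypothetical names made up for illustration and are not part of Vocamera.

```swift
// Hypothetical stand-ins: `Mode` mirrors the two EditMode cases used above,
// and `NoteEditor` is an illustrative type, not part of Vocamera.
enum Mode { case inactive, active }

struct NoteEditor {
    var title: String
    private(set) var mode: Mode = .inactive
    private(set) var titleBeforeEditing = ""
    private(set) var showTitleAlert = false

    init(title: String) { self.title = title }

    // Apply a requested mode change and return the mode that actually results.
    mutating func requestMode(_ newMode: Mode) -> Mode {
        switch newMode {
        case .active:
            // Entering edit mode: remember the title so it can be restored later.
            titleBeforeEditing = title
            showTitleAlert = false
            mode = .active
        case .inactive:
            if title.isEmpty {
                // Reject the change: restore the title, flag the alert, stay in edit mode.
                title = titleBeforeEditing
                showTitleAlert = true
                mode = .active
            } else {
                mode = .inactive
            }
        }
        return mode
    }
}

var editor = NoteEditor(title: "Chapter 1")
_ = editor.requestMode(.active)
editor.title = ""                              // the user clears the title
let result = editor.requestMode(.inactive)     // stays .active, title is restored
```

Keeping the rule in one place like this makes it possible to check the revert behavior without launching the UI.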
➁ Transform Images to Text
- Input: UIImages or CGImages (a UIImage is first converted into a CGImage)
- recognizeText function: transforms each CGImage into text using DispatchQueue and a completionHandler.
func recognizeText(withCompletionHandler completionHandler: @escaping ([String]) -> Void) {
    queue.async {
        // Collect one CGImage per input page.
        var cgImages = [CGImage]()
        if let uiImages = self.uiImages {
            // Library images: render each UIImage into a CGImage via Core Image.
            let ciContext = CIContext(options: nil)
            for uiImage in uiImages {
                // Skip images that cannot be converted instead of crashing.
                guard let ciImage = CIImage(image: uiImage),
                      let cgImage = ciContext.createCGImage(ciImage, from: ciImage.extent) else { continue }
                cgImages.append(cgImage)
            }
        } else if let cameraScan = self.cameraScan {
            // Scanner pages: take the CGImage of each scanned page.
            cgImages = (0..<cameraScan.pageCount).compactMap {
                cameraScan.imageOfPage(at: $0).cgImage
            }
        }
        // Run one text-recognition request per image and join the recognized lines.
        let textPerPage = cgImages.map { image -> String in
            let request = VNRecognizeTextRequest()
            let handler = VNImageRequestHandler(cgImage: image, options: [:])
            do {
                try handler.perform([request])
                guard let observations = request.results else { return "" }
                return observations.compactMap { $0.topCandidates(1).first?.string }.joined(separator: "\n")
            } catch {
                print("Failed to recognize the text: ", error)
                return ""
            }
        }
        // Deliver the result on the main queue so the caller can update the UI.
        DispatchQueue.main.async {
            completionHandler(textPerPage)
        }
    }
}
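The shape of `recognizeText` (slow work on a private queue, result handed back through a completion handler) can be tried without Vision. The sketch below assumes a hypothetical `BackgroundMapper` class and uses uppercasing as a stand-in for text recognition. The `deliverOn` parameter is also an addition for illustration, so the example can run as a command-line script where the main queue is never drained.

```swift
import Dispatch

// Hypothetical helper, not part of Vocamera: runs `work` for each page on a
// private serial queue, then reports all results on `callbackQueue`.
final class BackgroundMapper {
    private let queue = DispatchQueue(label: "example.background-mapper", qos: .userInitiated)

    func map(_ pages: [String],
             deliverOn callbackQueue: DispatchQueue = .main,
             work: @escaping (String) -> String,
             completion: @escaping ([String]) -> Void) {
        queue.async {
            // The slow work stays off the caller's thread.
            let results = pages.map(work)
            callbackQueue.async { completion(results) }
        }
    }
}

// Usage: the semaphore is only needed because this is a script;
// in an app the completion would run on the main queue and update view state.
let done = DispatchSemaphore(value: 0)
var output = [String]()
BackgroundMapper().map(["page one", "page two"],
                       deliverOn: .global(),
                       work: { $0.uppercased() },
                       completion: { results in
                           output = results
                           done.signal()
                       })
done.wait()
```

Returning one string per page, in page order, matches how `textPerPage` is passed to the completion handler above.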
➂ Trim Text & Validation
- Trim: using NLTagger’s options and a custom NLTag extension.
extension NLTag {
    func isBeHaveVerb(word: String) -> Bool {
        return NLTagConstants.beVerb.contains(word)
            || NLTagConstants.haveVerb.contains(word)
            || NLTagConstants.otherEraseWords.contains(word)
    }
    func containsNumbers(word: String) -> Bool {
        let numberSet = NSCharacterSet.decimalDigits
        return word.rangeOfCharacter(from: numberSet) != nil
    }
    ...
    ...
    ...
}
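The contents of `NLTagConstants` are not shown here, so the following sketch uses a hypothetical `ExampleStopWords` type with made-up word lists to show how the two checks behave on a few sample words. It also lowercases each word before the lookup, which is an assumption; the extension above compares words as they are.

```swift
import Foundation

// Hypothetical stand-in for NLTagConstants; the app's real word lists are not shown.
enum ExampleStopWords {
    static let beVerb: Set<String> = ["am", "is", "are", "was", "were", "be", "been", "being"]
    static let haveVerb: Set<String> = ["have", "has", "had", "having"]
}

func isStopWord(_ word: String) -> Bool {
    let lowered = word.lowercased()   // so "Is" and "is" are treated alike
    return ExampleStopWords.beVerb.contains(lowered)
        || ExampleStopWords.haveVerb.contains(lowered)
}

func containsDigits(_ word: String) -> Bool {
    return word.rangeOfCharacter(from: .decimalDigits) != nil
}

let candidates = ["Is", "chapter", "has", "2nd", "vocabulary"]
let kept = candidates.filter { !isStopWord($0) && !containsDigits($0) }
// kept == ["chapter", "vocabulary"]
```

Storing the word lists in a `Set` keeps each lookup constant-time, which matters when every token of a scanned page is checked.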
- Validation: NLTagger’s enumerateTags
...
...
...
tagger.enumerateTags(in: sentences.startIndex..<sentences.endIndex, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange in
    let word = "\(sentences[tokenRange])"
    guard let tag = tag,
          word.count > 1,
          !tag.containsNumbers(word: word),
          !tag.isBeHaveVerb(word: word),
          tag != .conjunction, tag != .number, tag != .pronoun, tag != .preposition
    else { return true }
    words.append(word)
    return true
}
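Because the tagger setup and the `options` value are elided above, here is a minimal self-contained sketch of the same kind of lexical-class filtering. The function name `candidateWords(in:)` and the option set `[.omitWhitespace, .omitPunctuation]` are assumptions chosen for illustration; the app may use different options. It needs Apple's NaturalLanguage framework.

```swift
import NaturalLanguage

// Illustrative free function, not part of Vocamera. The option set below is an
// assumption; the document does not show which options the app passes.
func candidateWords(in text: String) -> [String] {
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text

    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]
    let excluded: Set<NLTag> = [.conjunction, .number, .pronoun, .preposition]
    var words = [String]()

    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lexicalClass,
                         options: options) { tag, tokenRange in
        let word = String(text[tokenRange])
        // Keep tagged tokens longer than one character whose class is not excluded.
        if let tag = tag, word.count > 1, !excluded.contains(tag) {
            words.append(word)
        }
        return true   // continue with the next token
    }
    return words
}

let words = candidateWords(in: "She walked to the library and borrowed 2 books.")
```

Which words survive depends on the tags the system model assigns, so the result can vary between OS versions.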
4. Result
version 1.0.0
- Period: May 15, 2022 ~ July 6, 2022
- App Store
- Vocamera 1.0.0 (video)