Connect with us

Politics

Experiments in Fast Image Recognition on Mobile Devices – ReadWrite

Published

on

feature detection and matching for image recognition


Our journey in experimenting with machine vision and image recognition accelerated when we were developing an application, BooksPlus, to change a reader’s experience. BooksPlus uses image recognition to bring printed pages to life. A user can get immersed in rich and interactive content by scanning images in the book using the BooksPlus app. 

For example, you can scan an article about a poet and instantly listen to the poet’s audio. Similarly, you can scan images of historical artwork and watch a documentary clip.

As we started the development, we used commercially available SDKs that worked very well when we tried to recognize images locally. Still, these would fail as our library of images went over a few hundred images. A few services performed cloud-based recognition, but their pricing structure didn’t match our needs. 

Hence, we decided to experiment to develop our own image recognition solution.

What were the Objectives of our Experiments?

We focused on building a solution that would scale to the thousands of images that we needed to recognize. Our aim was to achieve high performance while being flexible to do on-device and in-cloud image matching. 

As we scaled the BooksPlus app, the target was to build a cost-effective outcome. We ensured that our own effort was as accurate as the SDKs (in terms of false positives and false negative matches). Our solutions needed to integrate with native iOS and Android projects.

Choosing an Image Recognition Toolkit

The first step of our journey was to zero down on an image recognition toolkit. We decided to use OpenCV based on the following factors:

  • A rich collection of image-related algorithms: OpenCV has a collection of more than 2500 optimized algorithms, which has many contributions from academia and the industry, making it the most significant open-source machine vision library.
  • Popularity: OpenCV has an estimated download exceeding 18 million and has a community of 47 thousand users, making it abundant technical support available.
  • BSD-licensed product: As OpenCV is BSD-licensed, we can easily modify and redistribute it according to our needs. As we wanted to white-label this technology, OpenCV would benefit us.
  • C-Interface: OpenCV has C interfaces and support, which was very important for us as both native iOS and Android support C; This would allow us to have a single codebase for both the platforms.

The Challenges in Our Journey

We faced numerous challenges while developing an efficient solution for our use case. But first, let’s first understand how image recognition works.

What is Feature Detection and Matching in Image Recognition?

Feature detection and matching is an essential component of every computer vision application. It detects an object, retrieve images, robot navigation, etc. 

Consider two images of a single object clicked at slightly different angles. How would you make your mobile recognize that both the pictures contain the same object? Feature Detection and Matching comes into play here.

A feature is a piece of information that represents if an image contains a specific pattern or not. Points and edges can be used as features. The image above shows the feature points on an image. One must select feature points in a way that they remain invariant under changes in illumination, translation, scaling, and in-plane rotation. Using invariant feature points is critical in the successful recognition of similar images under different positions.

The First Challenge: Slow Performance

When we first started experimenting with image recognition using OpenCV, we used the recommended ORB feature descriptors and FLANN feature matching with 2 nearest neighbours. This gave us accurate results, but it was extremely slow. 

The on-device recognition worked well for a few hundred images; the commercial SDK would crash after 150 images, but we were able to increase that to around 350. However, that was insufficient for a large-scale application.

To give an idea of the speed of this mechanism, consider a database of 300 images. It would take up to 2 seconds to match an image. With this speed, a database with thousands of images would take a few minutes to match an image. For the best UX, the matching must be real-time, in a blink of an eye. 

The number of matches made at different points of the pipeline needed to be minimized to improve the performance. Thus, we had two choices:

  1. Reduce the number of neighbors nearby, but we had only 2 neighbors: the least possible number of neighbors.
  2. Reduce the number of features we detected in each image, but reducing the count would hinder the accuracy. 

We settled upon using 200 features per image, but the time consumption was still not satisfactory. 

The Second Challenge: Low Accuracy

Another challenge that was standing right there was the reduced accuracy while matching images in books that contained text. These books would sometimes have words around the photos, which would add many highly clustered feature points to the words. This increased the noise and reduced the accuracy.

In general, the book’s printing caused more interference than anything else: the text on a page creates many useless features, highly clustered on the sharp edges of the letters causing the ORB algorithm to ignore the basic image features.

The Third Challenge: Native SDK

After the performance and precision challenges were resolved, the ultimate challenge was to wrap the solution in a library that supports multi-threading and is compatible with Android and iOS mobile devices.

Our Experiments That Led to the Solution:

Experiment 1: Solving the Performance Problem

The objective of the first experiment was to improve the performance. Our engineers came up with a solution to improve performance. Our system could potentially be presented with any random image which has billions of possibilities and we had to determine if this image was a match to our database. Therefore, instead of doing a direct match, we devised a two-part approach: Simple matching and In-depth matching.

Part 1: Simple Matching: 

To begin, the system will eliminate obvious non-matches. These are the images that can easily be identified as not matching. They could be any of our database’s thousands or even tens of thousands of images. This is accomplished through a very coarse level scan that considers only 20 features through the use of an on-device database to determine whether the image being scanned belongs to our interesting set. 

Part 2: In-Depth Matching 

After Part 1, we were left with very few images with similar features from a large dataset – the interesting set. Our second matching step is carried out on these few images. An in-depth match was performed only on these interesting images. To find the matching image, all 200 features are matched here. As a result, we reduced the number of feature matching loops performed on each image.

Every feature was matched against every feature of the training image. This brought down the matching loops down from 40,000 (200×200) to 400 (20×20). We would get a list of the best possible matching images to further compare the actual 200 features.

We were more than satisfied with the result. The dataset of 300 images that would previously take 2 seconds to match an image would now take only 200 milliseconds. This improved mechanism was 10x faster than the original, barely noticeable to the human eye in delay.

Experiment 2: Solving the Scale Problem

To scale up the system, part 1 of the matching was done on the device and part 2 could be done in the cloud – this way, only images that were a potential match were sent to the cloud. We would send the 20 feature fingerprint match information to the cloud, along with the additional detected image features. With a large database of interesting images, the cloud could scale.

This method allowed us to have a large database (with fewer features) on-device in order to eliminate obvious non-matches. The memory requirements were reduced, and we eliminated crashes caused by system resource constraints, which was a problem with the commercial SDK. As the real matching was done in the cloud, we were able to scale by reducing cloud computing costs by not using cloud CPU cycling for obvious non-matches.

Experiment 3: Improving the Accuracy

Now that we have better performance results, the matching process’s practical accuracy needs enhancement. As mentioned earlier, when scanning a picture in the real world, the amount of noise was enormous.

Our first approach was to use the CANNY edge detection algorithm to find the square or the rectangle edges of the image and clip out the rest of the data, but the results were not reliable. We observed two issues that still stood tall. The first was that the images would sometimes contain captions which would be a part of the overall image rectangle. The second issue was that the images would sometimes be aesthetically placed in different shapes like circles or ovals. We needed to come up with a simple solution.

Finally, we analyzed the images in 16 shades of grayscale and tried to find areas skewed towards only 2 to 3 shades of grey. This method accurately found areas of text on the outer regions of an image. After finding these portions, blurring them would make them dormant in interfering with the recognition mechanism. 

Experiment 4: Implementing a Native SDK for Mobile

We swiftly managed to enhance the feature detection and matching system’s accuracy and efficiency in recognizing images. The final step was implementing an SDK that could work across both iOS and Android devices like it would have been if we implemented them in native SDKs. To our advantage, both Android and iOS support the use of C libraries in their native SDKs. Therefore, an image recognition library was written in C, and two SDKs were produced using the same codebase. 

Each mobile device has different resources available. The higher-end mobile devices have multiple cores to perform multiple tasks simultaneously. We created a multi-threaded library with a configurable number of threads. The library would automatically configure the number of threads at runtime as per the mobile device’s optimum number.

Conclusion

To summarize, we developed a large-scale image recognition application (used in multiple fields including Augmented Reality) by improving the accuracy and the efficiency of the machine vision: feature detection and matching. The already existing solutions were slow and our use case produced noise that drastically reduced accuracy. We desired accurate match results within a blink of an eye.

Thus, we ran a few tests to improve the mechanism’s performance and accuracy. This reduced the number of feature matching loops by 90%, resulting in a 10x faster match. Once we had the performance that we desired, we needed to improve the accuracy by reducing the noise around the text in the images. We were able to accomplish this by blurring out the text after analyzing the image in 16 different shades of grayscale. Finally, everything was compiled into the C language library that can be used with iOS and Android.

Anand Shah

Ignite Solutions

Founder and CEO of Ignite Solutions, Anand Shah is a versatile technologist and entrepreneur. His passion is backed by 30 years of experience in the field of technology with a focus on startups, product management, ideation, lean methods, technology leadership, customer relationships, and go-to-market. He serves as the CTO of several client companies.

Politics

Fintech Kennek raises $12.5M seed round to digitize lending

Published

on

Google eyed for $2 billion Anthropic deal after major Amazon play


London-based fintech startup Kennek has raised $12.5 million in seed funding to expand its lending operating system.

According to an Oct. 10 tech.eu report, the round was led by HV Capital and included participation from Dutch Founders Fund, AlbionVC, FFVC, Plug & Play Ventures, and Syndicate One. Kennek offers software-as-a-service tools to help non-bank lenders streamline their operations using open banking, open finance, and payments.

The platform aims to automate time-consuming manual tasks and consolidate fragmented data to simplify lending. Xavier De Pauw, founder of Kennek said:

“Until kennek, lenders had to devote countless hours to menial operational tasks and deal with jumbled and hard-coded data – which makes every other part of lending a headache. As former lenders ourselves, we lived and breathed these frustrations, and built kennek to make them a thing of the past.”

The company said the latest funding round was oversubscribed and closed quickly despite the challenging fundraising environment. The new capital will be used to expand Kennek’s engineering team and strengthen its market position in the UK while exploring expansion into other European markets. Barbod Namini, Partner at lead investor HV Capital, commented on the investment:

“Kennek has developed an ambitious and genuinely unique proposition which we think can be the foundation of the entire alternative lending space. […] It is a complicated market and a solution that brings together all information and stakeholders onto a single platform is highly compelling for both lenders & the ecosystem as a whole.”

The fintech lending space has grown rapidly in recent years, but many lenders still rely on legacy systems and manual processes that limit efficiency and scalability. Kennek aims to leverage open banking and data integration to provide lenders with a more streamlined, automated lending experience.

The seed funding will allow the London-based startup to continue developing its platform and expanding its team to meet demand from non-bank lenders looking to digitize operations. Kennek’s focus on the UK and Europe also comes amid rising adoption of open banking and open finance in the regions.

Featured Image Credit: Photo from Kennek.io; Thank you!

Radek Zielinski

Radek Zielinski is an experienced technology and financial journalist with a passion for cybersecurity and futurology.

Continue Reading

Politics

Fortune 500’s race for generative AI breakthroughs

Published

on

Deanna Ritchie


As excitement around generative AI grows, Fortune 500 companies, including Goldman Sachs, are carefully examining the possible applications of this technology. A recent survey of U.S. executives indicated that 60% believe generative AI will substantially impact their businesses in the long term. However, they anticipate a one to two-year timeframe before implementing their initial solutions. This optimism stems from the potential of generative AI to revolutionize various aspects of businesses, from enhancing customer experiences to optimizing internal processes. In the short term, companies will likely focus on pilot projects and experimentation, gradually integrating generative AI into their operations as they witness its positive influence on efficiency and profitability.

Goldman Sachs’ Cautious Approach to Implementing Generative AI

In a recent interview, Goldman Sachs CIO Marco Argenti revealed that the firm has not yet implemented any generative AI use cases. Instead, the company focuses on experimentation and setting high standards before adopting the technology. Argenti recognized the desire for outcomes in areas like developer and operational efficiency but emphasized ensuring precision before putting experimental AI use cases into production.

According to Argenti, striking the right balance between driving innovation and maintaining accuracy is crucial for successfully integrating generative AI within the firm. Goldman Sachs intends to continue exploring this emerging technology’s potential benefits and applications while diligently assessing risks to ensure it meets the company’s stringent quality standards.

One possible application for Goldman Sachs is in software development, where the company has observed a 20-40% productivity increase during its trials. The goal is for 1,000 developers to utilize generative AI tools by year’s end. However, Argenti emphasized that a well-defined expectation of return on investment is necessary before fully integrating generative AI into production.

To achieve this, the company plans to implement a systematic and strategic approach to adopting generative AI, ensuring that it complements and enhances the skills of its developers. Additionally, Goldman Sachs intends to evaluate the long-term impact of generative AI on their software development processes and the overall quality of the applications being developed.

Goldman Sachs’ approach to AI implementation goes beyond merely executing models. The firm has created a platform encompassing technical, legal, and compliance assessments to filter out improper content and keep track of all interactions. This comprehensive system ensures seamless integration of artificial intelligence in operations while adhering to regulatory standards and maintaining client confidentiality. Moreover, the platform continuously improves and adapts its algorithms, allowing Goldman Sachs to stay at the forefront of technology and offer its clients the most efficient and secure services.

Featured Image Credit: Photo by Google DeepMind; Pexels; Thank you!

Deanna Ritchie

Managing Editor at ReadWrite

Deanna is the Managing Editor at ReadWrite. Previously she worked as the Editor in Chief for Startup Grind and has over 20+ years of experience in content management and content development.

Continue Reading

Politics

UK seizes web3 opportunity simplifying crypto regulations

Published

on

Deanna Ritchie


As Web3 companies increasingly consider leaving the United States due to regulatory ambiguity, the United Kingdom must simplify its cryptocurrency regulations to attract these businesses. The conservative think tank Policy Exchange recently released a report detailing ten suggestions for improving Web3 regulation in the country. Among the recommendations are reducing liability for token holders in decentralized autonomous organizations (DAOs) and encouraging the Financial Conduct Authority (FCA) to adopt alternative Know Your Customer (KYC) methodologies, such as digital identities and blockchain analytics tools. These suggestions aim to position the UK as a hub for Web3 innovation and attract blockchain-based businesses looking for a more conducive regulatory environment.

Streamlining Cryptocurrency Regulations for Innovation

To make it easier for emerging Web3 companies to navigate existing legal frameworks and contribute to the UK’s digital economy growth, the government must streamline cryptocurrency regulations and adopt forward-looking approaches. By making the regulatory landscape clear and straightforward, the UK can create an environment that fosters innovation, growth, and competitiveness in the global fintech industry.

The Policy Exchange report also recommends not weakening self-hosted wallets or treating proof-of-stake (PoS) services as financial services. This approach aims to protect the fundamental principles of decentralization and user autonomy while strongly emphasizing security and regulatory compliance. By doing so, the UK can nurture an environment that encourages innovation and the continued growth of blockchain technology.

Despite recent strict measures by UK authorities, such as His Majesty’s Treasury and the FCA, toward the digital assets sector, the proposed changes in the Policy Exchange report strive to make the UK a more attractive location for Web3 enterprises. By adopting these suggestions, the UK can demonstrate its commitment to fostering innovation in the rapidly evolving blockchain and cryptocurrency industries while ensuring a robust and transparent regulatory environment.

The ongoing uncertainty surrounding cryptocurrency regulations in various countries has prompted Web3 companies to explore alternative jurisdictions with more precise legal frameworks. As the United States grapples with regulatory ambiguity, the United Kingdom can position itself as a hub for Web3 innovation by simplifying and streamlining its cryptocurrency regulations.

Featured Image Credit: Photo by Jonathan Borba; Pexels; Thank you!

Deanna Ritchie

Managing Editor at ReadWrite

Deanna is the Managing Editor at ReadWrite. Previously she worked as the Editor in Chief for Startup Grind and has over 20+ years of experience in content management and content development.

Continue Reading

Copyright © 2021 Seminole Press.