Page Object Model and Page Factory in Selenium

With the relentless pace of software development, automation testing has become an essential ally. Selenium WebDriver has established itself as a leader in web application testing due to its versatility and effectiveness in handling structured, code-based environments. Its design patterns, particularly the Page Object Model (POM) and Page Factory, offer a streamlined way to manage web elements, making automated tests clearer and more maintainable.

However, despite its strengths, Selenium isn’t a universal solution. Complex UIs or applications without accessible object models can create roadblocks for testers. This blog explores how POM and Page Factory work in Selenium, examines their advantages and limitations, and finally introduces T-Plan’s image-based approach as a way to automate even the most complex interfaces.

What is the Page Object Model (POM) in Selenium?

Overview of POM

The Page Object Model (POM) is a design pattern in Selenium that structures web UI elements into an organised object repository, making test automation cleaner and more manageable. Each web page is represented as a page class, while individual elements – like buttons, text fields and links – are defined as variables within that class. This setup, often referred to as a parent-child cascade of objects, provides a logical, hierarchical structure for managing elements within a project structure.

This parent-child structure mirrors the Document Object Model (DOM), with parent elements containing child elements in a clear hierarchy. By organising web elements into reusable objects, testers can interact with elements efficiently, reducing code duplication and simplifying test maintenance.

Key takeaways for using POM:

Reusable design: Ideal for applications with repetitive interactions in app testing.

Structured codebase: POM organises elements in ways that mimic the DOM, making testing setups intuitive and easy to manage.

Benefits of POM

The Page Object Model (POM) brings several key benefits to Selenium test automation:

Code reusability: A single page object can be used across multiple test cases. For example, if you’re testing a login page on a website, you can define it once and reuse it for different tests, from basic login verification to checking password requirements, all without duplicating code.

Maintainability: POM simplifies maintenance by centralising element definitions. When the UI changes, testers update the affected page object once rather than adjusting each individual test case. This is particularly helpful in agile environments, where UIs evolve frequently.

Readability and clarity: By organising elements within classes that represent each page, POM makes tests more intuitive. Testers can understand the workflow easily. Complex, element-heavy scripts are turned into straightforward, navigable code.
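
To make the reuse and maintenance points concrete, here is a minimal sketch of one page object shared by two checks. It is an illustration rather than production code: Browser and FakeBrowser are hypothetical stand-ins for Selenium's WebDriver so that the example runs without a real browser, the class names are invented for this sketch, and the credentials are made up. The element ids match the login example used in this article.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for WebDriver: just enough to type and click by element id.
interface Browser {
    void type(String elementId, String text);
    void click(String elementId);
    String currentPage();
}

// In-memory fake so the sketch runs without a real browser.
class FakeBrowser implements Browser {
    private final Map<String, String> fields = new HashMap<>();
    private String page = "login";

    public void type(String elementId, String text) {
        fields.put(elementId, text);
    }

    public void click(String elementId) {
        if (elementId.equals("loginButton")) {
            boolean ok = "alice".equals(fields.get("username"))
                    && "s3cret".equals(fields.get("password"));
            page = ok ? "dashboard" : "login";
        }
    }

    public String currentPage() {
        return page;
    }
}

// The page object: the only class that knows the element ids.
class LoginPageObject {
    private final Browser browser;

    LoginPageObject(Browser browser) {
        this.browser = browser;
    }

    void login(String username, String password) {
        browser.type("username", username);
        browser.type("password", password);
        browser.click("loginButton");
    }
}

// Two independent checks reuse the same page object.
class LoginChecks {
    static boolean validLoginReachesDashboard() {
        Browser browser = new FakeBrowser();
        new LoginPageObject(browser).login("alice", "s3cret");
        return browser.currentPage().equals("dashboard");
    }

    static boolean wrongPasswordStaysOnLogin() {
        Browser browser = new FakeBrowser();
        new LoginPageObject(browser).login("alice", "wrong");
        return browser.currentPage().equals("login");
    }
}
```

If the login button's id changed, only LoginPageObject would need editing; both checks would stay exactly as they are.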

The problem with POM

While POM offers significant advantages, it also has limitations. Ever had trouble accessing an application’s code when setting up tests? That’s where POM can hit a wall.

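Access to the object model: POM depends on Selenium being able to reach the application’s elements through its object model. Where the UI is not exposed that way, for example when it is drawn on a canvas or delivered through a remote display, Selenium has nothing to locate.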

Interrogating objects: If elements lack stable ids, names or other distinguishing attributes, locators become brittle, and if they are not exposed in the DOM at all, Selenium can’t interact with them. This limits its effectiveness in complex or custom UIs.

Code example: POM structure

Here’s a simple example of how POM can structure a login page in Selenium:

// Page Object Model example for a login page (plain POM, without Page Factory)

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

public class LoginPage {

    private final WebDriver driver;

    // Locators are defined once, in one place
    private final By usernameField = By.id("username");
    private final By passwordField = By.id("password");
    private final By loginButton = By.id("loginButton");

    // Constructor
    public LoginPage(WebDriver driver) {
        this.driver = driver;
    }

    // Method to log in
    public void login(String username, String password) {
        driver.findElement(usernameField).sendKeys(username);
        driver.findElement(passwordField).sendKeys(password);
        driver.findElement(loginButton).click();
    }
}

This example demonstrates the simplicity of POM, where elements are defined and managed within a single class, creating a cleaner and more intuitive test framework.

What is Page Factory in Selenium?

Overview of Page Factory

Page Factory is an extension of the Page Object Model (POM) that simplifies the way web elements are defined in Selenium. It introduces the @FindBy annotation, which declares how each element should be located. Instead of manually looking up each element, testers call PageFactory.initElements(), which fills every annotated field with a proxy; the actual lookup is deferred until the element is used in a test.

This dynamic approach not only reduces boilerplate code but also enhances readability, making it easier to set up page objects without repetitive definitions.

Advantages of Page Factory

Page Factory brings several advantages to Selenium testing:

Ease of use: Automatic instantiation of elements reduces the need for verbose, manual setup, letting testers focus more on logic than setup. For instance, if you’re testing a form submission page, the @FindBy annotation lets you quickly define form elements, making setup faster and easier.

Lazy lookup: Page Factory does not search for an element until the test actually uses it, so creating a page object costs nothing for elements a test never touches. By default the lookup is repeated each time the element is used, which keeps references fresh when the page re-renders. For elements that never change, the @CacheLookup annotation stores the first result and avoids repeated searches.
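
The deferred lookup can be pictured as a proxy that stands in for the element and only searches the page when a method is called on it. The sketch below shows that idea with the JDK's java.lang.reflect.Proxy. It is a simplified illustration, not Selenium's own code: Element is a hypothetical one-method stand-in for WebElement, and findHeading() simulates a DOM search by incrementing a counter.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical, tiny element interface standing in for WebElement.
interface Element {
    String text();
}

class LazyElements {
    // Returns a proxy that runs `lookup` only when a method is called on it,
    // and runs it again on every later call (no caching).
    static Element lazy(Supplier<Element> lookup) {
        InvocationHandler handler =
                (proxy, method, args) -> method.invoke(lookup.get(), args);
        return (Element) Proxy.newProxyInstance(
                Element.class.getClassLoader(),
                new Class<?>[] {Element.class},
                handler);
    }
}

class LazyLookupDemo {
    // Counts how many times the simulated page was searched.
    static final AtomicInteger lookups = new AtomicInteger();

    // Simulated DOM search: records the call and returns a fixed element.
    static Element findHeading() {
        lookups.incrementAndGet();
        return () -> "Welcome";
    }

    public static void main(String[] args) {
        Element heading = LazyElements.lazy(LazyLookupDemo::findHeading);
        System.out.println("Searches after creating the proxy: " + lookups.get());
        heading.text();
        heading.text();
        System.out.println("Searches after two uses: " + lookups.get());
    }
}
```

Creating the proxy triggers no search; each call to text() triggers one. Caching the first result inside the handler would correspond to what @CacheLookup does, at the price of a stale reference if the page re-renders.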

Limitations of Page Factory

While Page Factory adds convenience, it also has limitations:

Dependency on DOM structure: Like POM, Page Factory relies on the application’s object model. If elements aren’t properly mapped in the DOM, Page Factory won’t be able to locate them, limiting its effectiveness in certain applications.

Challenges with dynamic or complex UIs: If the UI changes frequently or lacks a well-defined path in the DOM, Page Factory’s reliability can be compromised, making it harder to automate tests.

Code example: Using Page Factory

Below is an example of a login page using Page Factory to define elements in Selenium:

// Using Page Factory for the same login page

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.FindBy;
import org.openqa.selenium.support.PageFactory;

public class LoginPage {

    private final WebDriver driver;

    // Define WebElements with the @FindBy annotation
    @FindBy(id = "username")
    private WebElement usernameField;

    @FindBy(id = "password")
    private WebElement passwordField;

    @FindBy(id = "loginButton")
    private WebElement loginButton;

    // Constructor
    public LoginPage(WebDriver driver) {
        this.driver = driver;
        PageFactory.initElements(driver, this); // Fill the annotated fields with lazy proxies
    }

    // Login method
    public void login(String username, String password) {
        usernameField.sendKeys(username);
        passwordField.sendKeys(password);
        loginButton.click();
    }
}

In this example, PageFactory.initElements() reads the @FindBy annotations and wires up each field, so the class needs no explicit findElement calls. This demonstrates how Page Factory can simplify and streamline test scripts.

When Selenium fails – the need for a hybrid approach

The scenario without a page object model

Imagine trying to automate a custom UI without proper identifiers – ever struggled with unrecognised UI elements? When UI elements lack identifiers like IDs, class names or paths within the DOM, even advanced setups with POM or Page Factory can fall short. These elements become unlocatable for Selenium, effectively halting the automation process.

Real-world example

Consider a scenario where you need to test a web application filled with custom elements: widgets, custom buttons or dynamic content that isn’t mapped in the DOM with identifiable paths. Without IDs or distinct attributes for these elements, Selenium cannot automate interactions, creating a frustrating barrier for testers. When traditional approaches fail, alternative solutions become essential.

The solution: A hybrid approach

Enter T-Plan, an image-based testing tool that compensates for Selenium’s limitations by identifying elements visually rather than relying on DOM structures. Unlike Selenium’s code-based approach, T-Plan interacts with on-screen elements directly, making it an ideal tool for automating applications with complex or unstructured UIs.

By combining Selenium’s object-based testing with T-Plan’s image-based capabilities, testers can achieve a hybrid solution. This approach leverages Selenium’s strengths in handling accessible DOM elements while using T-Plan to automate interactions that would otherwise be impossible.

For more information on how T-Plan’s image-based testing can enhance your automation strategy, visit our image-based automated testing page.

T-Plan’s image-based testing vs. Selenium’s object-based approach

Image-based testing with T-Plan

Unlike Selenium, which relies on the DOM to locate and interact with elements, T-Plan’s image-based testing interacts with applications by visually identifying on-screen elements. This approach allows T-Plan to function as a human tester would, recognising elements based solely on appearance rather than underlying code structures.

This image-based technique is particularly beneficial in environments with complex UIs, multimedia applications, custom controls or virtualised environments like Citrix. For testing scenarios that Selenium cannot fully address, T-Plan provides a robust alternative.
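
To give a feel for what locating an element by appearance involves, here is a deliberately naive sketch that searches a screenshot for an exact copy of a smaller reference image. It is not T-Plan's algorithm, and the class and method names are invented for this illustration; real image-based tools also have to tolerate scaling, anti-aliasing and colour differences, which an exact pixel comparison does not.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

class ImageLocator {
    // Returns the top-left corner of the first exact match of `template`
    // inside `screen`, scanning row by row, or null if there is no match.
    static Point find(BufferedImage screen, BufferedImage template) {
        int maxX = screen.getWidth() - template.getWidth();
        int maxY = screen.getHeight() - template.getHeight();
        for (int y = 0; y <= maxY; y++) {
            for (int x = 0; x <= maxX; x++) {
                if (matchesAt(screen, template, x, y)) {
                    return new Point(x, y);
                }
            }
        }
        return null;
    }

    // True when every template pixel equals the screen pixel at the same offset.
    private static boolean matchesAt(BufferedImage screen, BufferedImage template, int x, int y) {
        for (int ty = 0; ty < template.getHeight(); ty++) {
            for (int tx = 0; tx < template.getWidth(); tx++) {
                if (screen.getRGB(x + tx, y + ty) != template.getRGB(tx, ty)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Once the coordinates are known, a tool can click at that position on screen without ever consulting the DOM, which is why this style of automation also works for canvases, custom controls and remote displays.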

Key benefits of T-Plan

No dependency on code: T-Plan operates independently of the application’s underlying code. It can interact with any visible on-screen object, making it ideal for inaccessible or highly customised UIs.

Testing in complex environments: When faced with applications that have unstructured or inaccessible elements, T-Plan provides a way to automate tests without reliance on the DOM. This helps improve testing efficiency and reliability, enhancing overall test management processes.

Pixel-perfect accuracy: T-Plan’s visual interaction approach ensures precision in testing, which is particularly valuable for applications involving animations, graphical interfaces or multimedia content.

Visit our features page for more details on T-Plan’s capabilities.

Limitations of Selenium in complex scenarios

While Selenium excels in structured environments, it encounters challenges in complex scenarios:

DOM reliance: Selenium depends on well-structured DOM elements with accessible paths. In cases where elements don’t follow this structure, automation breaks down.

Challenges with dynamic elements: For applications where UI elements shift, resize or appear/disappear dynamically, Selenium struggles to maintain consistent automation, affecting test reliability.
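
For elements that are present in the DOM but simply appear late, a common mitigation is to poll for them instead of failing on the first miss. Selenium provides WebDriverWait for this purpose; the helper below is a standard-library sketch of the same idea so that it can run without a browser. The Polling class and its waitFor method are hypothetical names for this illustration, and polling does not help with elements that are never exposed in the DOM at all.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.Supplier;

class Polling {
    // Calls `probe` repeatedly until it yields a value or `timeout` elapses.
    // Returns an empty Optional on timeout or if the thread is interrupted.
    static <T> Optional<T> waitFor(Supplier<Optional<T>> probe, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            Optional<T> result = probe.get();
            if (result.isPresent()) {
                return result;
            }
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty();
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return Optional.empty();
            }
        }
    }
}
```

In a Selenium test the probe would wrap an element lookup and return an empty Optional while the element is missing. When the probe still comes back empty after the timeout, that is the point at which a visual fallback becomes worth considering.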

Hybrid approach in practice

By combining Selenium and T-Plan, testers can implement a hybrid approach that leverages the strengths of each tool:

Selenium handles interactions with structured, code-accessible elements, such as filling forms, clicking buttons and navigating web pages.

T-Plan takes over in complex scenarios, enabling automation for custom UI elements, visual content or applications that don’t follow a standard DOM structure.

Example workflow:

Selenium initiates the test, interacting with structured elements as usual.
When Selenium encounters an unrecognised element, T-Plan steps in, using image-based recognition to locate and interact with the element.
The test concludes seamlessly, covering both object-based and image-based interactions.

Hybrid approach in action – code example: combining Selenium and T-Plan

Here’s an example of how a hybrid approach might look in code:

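// Hybrid approach example
// Assumes a WebDriver field named driver. captureWindow() and clickOnImage()
// are illustrative names for the image-based steps provided by your T-Plan integration.

public void hybridTest() {
    try {
        // Selenium-based interaction
        driver.findElement(By.id("standardElement")).click();
    } catch (org.openqa.selenium.NoSuchElementException e) {
        // Fall back to T-Plan for dynamic element interaction
        captureWindow("Application");
        clickOnImage("dynamicElementImage.png");
    }
}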

In this example, Selenium performs the initial interactions, while T-Plan provides backup for dynamic elements that Selenium cannot access directly.

When to choose a hybrid approach over pure Selenium

Key indicators for a hybrid approach

Choosing a hybrid approach with both Selenium and T-Plan can significantly enhance testing in specific scenarios. Here are key indicators that suggest a hybrid approach is the right choice:

Lack of access to code: When testers don’t have direct access to the application’s code or object model, Selenium’s DOM-dependent methods may be insufficient. T-Plan’s image-based testing allows automation even when the underlying code is inaccessible.
Custom-built UIs: Applications that use non-standard frameworks, custom widgets or graphical interfaces often don’t follow traditional DOM structures, making Selenium less effective. T-Plan complements Selenium by providing a solution for these unique environments.
Dynamic or multimedia applications: If an application contains multimedia elements, complex animations or components that require pixel-level accuracy, T-Plan’s visual precision ensures that even the smallest visual changes are detected.
Virtualised or remote environments: In virtualised setups, such as Citrix, where applications are accessed remotely and the visual display is the only available interaction point, T-Plan’s image-based testing fills the gap left by Selenium.

The power of flexibility

Selenium’s Page Object Model and Page Factory are powerful tools for web automation, bringing structure and efficiency to UI testing. However, they have limitations, particularly in environments without accessible object models. A hybrid approach with T-Plan and Selenium bridges these gaps, offering greater flexibility and coverage across diverse testing scenarios.

Final takeaway: Leveraging Selenium’s strengths in code-based testing alongside T-Plan’s image-based capabilities allows testers to cover a broader range of UI elements, enhancing test accuracy and reducing the risk of missed issues. For more details on integrating this hybrid approach into your workflow, visit our Selenium integration guide.

Conclusion

With development cycles picking up speed and competition growing, automation testing is vital to ensure the quality and performance of web applications. A hybrid solution that pairs Selenium with T-Plan draws on the strengths of both methods, optimising automation efforts and enhancing the overall testing process, regardless of the browser or environment used.

Adopt this hybrid approach to enhance your testing capability, save time and achieve greater test coverage for higher ROI. Ready to take your testing strategy to the next level? Request your free trial and experience first-hand how T-Plan can support your automation needs. Let’s transform your testing approach together – because at T-Plan, we’re here to help you navigate every challenge in automation.