Read Contents of File to Write Into Textarea Selenium Python

Filling in Web Forms

Web forms are ubiquitous on the web. A web form comprises web elements such as input boxes, check boxes, radio buttons, links, drop-down menus, and submit buttons that collect user information. To process web forms, we first need to find these web elements and then take subsequent actions on them, such as selecting a value or entering some text. Selenium has an API that helps us do that. Since we have covered how to find web element(s) using Selenium selectors in the previous chapter, this chapter focuses on accessing forms in Selenium: performing actions on and interacting with the forms. Let us see how different actions can be performed on each type of web field that may be involved in a web form. We use this dynamic complete search form webpage to illustrate most of the examples used in this chapter. Below is a screenshot of how this website uses forms:

Input box

The input box is the text box that displays user input. To handle any input box, we must be able to enter information, clear information, or get information from the box. To enter text into a text box we can use the send_keys() method, which types the required text from our automation script. The following code enters the letter "A", the first letter of the student names we want, into a text box whose ID is "search_name":

    driver.find_element_by_id('search_name').send_keys("A")

To clear pre-entered text we can use the clear() method. For example:

    driver.find_element_by_id('search_name').clear()

If we need to validate some existing text, we can fetch the text already in a text box using the get_attribute() method:

    nameText = driver.find_element_by_id('search_name').get_attribute("value")

Check box

A check box is a small box that we can check or uncheck. To select or check a value we use the click() method. It simply changes the state from unchecked to checked and vice versa. For example, the code below clicks on the Accept Privacy Policy checkbox:

    driver.find_element_by_id('privacypolicy').click()

Dealing with checkboxes is not always so straightforward. We may need to select a checkbox only when it is not selected already. Or, we may want to deselect a checkbox only when it is already selected. If we are trying to select a checkbox but perform a click on an already selected checkbox, the checkbox will be deselected, which is not what we want. So, we first need to check whether the checkbox is selected or not. To get the current state of the checkbox we can use one of two methods: is_selected() or get_attribute("checked"). For example, using is_selected():

    privacy_boolean = driver.find_element_by_id('privacypolicy').is_selected()
    print(privacy_boolean)
    print(type(privacy_boolean))

This returns a boolean value: True if the checkbox is checked and False otherwise. Alternatively, using get_attribute("checked"):

    privacy_other = driver.find_element_by_id('privacypolicy').get_attribute("checked")
    print(privacy_other)
    print(type(privacy_other))

This returns a truthy value (typically the string "true") if the checkbox is selected, but returns None (of type NoneType) if the checkbox is not selected. The following code shows how to deselect a checkbox only when it is selected:

    privacy_checkbox = driver.find_element_by_id('privacypolicy')
    if privacy_checkbox.is_selected():
        privacy_checkbox.click()
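The check-before-click idea can be wrapped in a small helper. The following is a minimal sketch, not part of the Selenium API: set_checkbox is a hypothetical name, and FakeCheckbox is a stand-in that only mimics the is_selected() and click() methods of a real web element so that the sketch runs without a browser.

```python
def set_checkbox(checkbox, desired):
    """Click the checkbox only if its current state differs from `desired`."""
    if checkbox.is_selected() != desired:
        checkbox.click()
    return checkbox.is_selected()


class FakeCheckbox:
    """Hypothetical stand-in for a Selenium web element (no browser needed)."""

    def __init__(self, selected=False):
        self._selected = selected

    def is_selected(self):
        return self._selected

    def click(self):
        # A real checkbox toggles its state on every click.
        self._selected = not self._selected


box = FakeCheckbox(selected=True)
print(set_checkbox(box, True))   # already checked, so no click is issued
print(set_checkbox(box, False))  # checked, so one click unchecks it
```

With a real driver, the same helper could be called as set_checkbox(driver.find_element_by_id('privacypolicy'), False).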

Radio button

A radio button is a circular element on the screen that we can select. A radio button is similar to a checkbox, but only one radio button can be selected out of several choices, while multiple checkboxes can be selected. The actions performed on a radio button are similar to those on a checkbox, and we can use the same methods as above to select a radio button or validate its selection status: click() and is_selected() / get_attribute("checked"). The code below provides an example:

    driver.find_element_by_id('p5').click()
    pageSize_5_boolean = driver.find_element_by_id('p5').is_selected()
    print(pageSize_5_boolean)
    print(type(pageSize_5_boolean))
    pageSize_5_other = driver.find_element_by_id('p5').get_attribute("checked")
    print(pageSize_5_other)
    print(type(pageSize_5_other))

Radio buttons do not support deselection. To deselect a radio button, one needs to select another radio button in that group. If we click the same radio button trying to deselect it, we get the same selection as before; nothing changes. The following code shows how to deselect a radio button only when it is selected, by clicking another button in the group (here, the one with ID 'p10'):

    if driver.find_element_by_id('p5').is_selected():
        driver.find_element_by_id('p10').click()

Link

A link redirects us to a new web page, a new pop-up window, or something similar. We can use two mechanisms to navigate to a new screen, a new pop-up, or a new form: we can either click on the link element we find, or get the new URL from the link element and then navigate to it. Here is an example of operating the link embedded in the example webpage using the first mechanism:

    driver.find_element_by_id("privacy_policy").click()

A link's URL is generally embedded in the link element, which has an <a> tag name, as the "href" property. Instead of clicking directly on the link, we can read this property with the get_attribute() method and navigate to it. Here is the same example using the second mechanism:

    privacy_object = driver.find_element_by_id("privacy_policy")
    privacy_link = privacy_object.get_attribute("href")
    driver.get(privacy_link)

Dropdown

A dropdown is a list with an arrow at the rightmost end that expands it to show its values. It provides a list of options to the user, giving access to one or multiple values as required. To work with a dropdown, we first need to find the main element group and then go further inside and select the sub-element that we want to capture. The Selenium Python API provides the Select class, which allows us to select the element of our choice. Note that the Select class only works with elements that have the <select> tag. We can select a sub-element of the dropdown using: 1) index, 2) value, or 3) text.

If the dropdown has an "index" attribute, then we can use that index to select a particular option. We need to be careful when using this approach, because it is not uncommon for the index to start at 0. We can use the select_by_index() method to select an option using the "index" attribute. For instance, suppose we want to select the 5th grade students:

    grade_dropdown = Select(driver.find_element_by_id("search_grade"))
    grade_dropdown.select_by_index(6)

If the HTML markup gives an <option> tag a "value" attribute, we can select the option whose value matches the argument. Suppose the HTML for the dropdown looks like this:

    <td>
        <select id="search_grade">
            <option selected>(no value)</option>
            <option value="K">K</option>
            <option value="1">1</option>
            <option value="2">2</option>
            <option value="3">3</option>
            <option value="4">4</option>
            <option value="5">5</option>
        </select>
    </td>

We can use the select_by_value() method to select an option using the "value" attribute.

    grade_dropdown.select_by_value("5")

Probably the easiest way of selecting a sub-element is to use the text of the dropdown. We match the text that is displayed in the dropdown using the select_by_visible_text() method. For example:

    grade_dropdown.select_by_visible_text("5")

In a similar way, we can deselect any selected value from the dropdown using any of the following options: 1) deselect_by_index(), 2) deselect_by_value(), or 3) deselect_by_visible_text(). These methods can be used only when the dropdown allows multiple selections. The deselect_all() method clears all the selected options. This is also only applicable when the dropdown allows multiple selections. If we try to use it on a dropdown that allows only a single selection, it will throw a NotImplementedError exception.

There are times, while automating our web app with Selenium, when we need to validate the options in a dropdown list. The Select class provides properties that allow us to do this. The first two properties are useful when we can select multiple options.

.all_selected_options: Get the list of all the selected options.
.first_selected_option: Return the first option that has been selected from the dropdown; unlike the property above, it returns a single web element, not a list.
.is_multiple: True if the dropdown allows multiple selections and None otherwise.
.options: Get a list of all available options in a dropdown.
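As one way to use the options property for validation, the sketch below compares the option texts a dropdown offers with the texts we expect. It assumes only that the dropdown object exposes options and that each option has a text attribute, as Select and its web elements do; missing_options, FakeOption, and FakeSelect are hypothetical names, and the fake classes exist only so the sketch runs without a browser.

```python
def missing_options(dropdown, expected_texts):
    """Return the expected option texts that the dropdown does not offer."""
    available = {option.text for option in dropdown.options}
    return [text for text in expected_texts if text not in available]


class FakeOption:
    """Hypothetical stand-in for an <option> web element."""

    def __init__(self, text):
        self.text = text


class FakeSelect:
    """Hypothetical stand-in for Select: exposes only `options`."""

    def __init__(self, texts):
        self.options = [FakeOption(text) for text in texts]


# Option texts taken from the example dropdown markup in this chapter.
grades = FakeSelect(["(no value)", "K", "1", "2", "3", "4", "5"])
print(missing_options(grades, ["K", "5", "6"]))  # "6" is not offered
```

With a real page, grades would instead be Select(driver.find_element_by_id("search_grade")).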

Buttons

Buttons are simply used to submit whatever information we have filled in our forms to the server. This can be done by clicking on the button, generally using the click() method:

    driver.find_element_by_id("search").click()

Demos

Using our dynamic complete search form webpage, nosotros will examine a uncomplicated program that handles all the types of form elements we have covered so far. Second, we will examine an advanced program that handles the situation where we will need to admission this form not once but many times sequentially.

Fill in the form in one case

Suppose that we desire to search all the 5th form students whose names start with "A" and page size gear up at 5. The program below demonstrates how to fill up in the form and submit it to the server. We get-go with importing the packages we demand for the job. Nosotros peculiarly need to import Select module because we demand to select the 5th grade from a dropdown menu.

                                                      from                    selenium                    import                    webdriver                                      from                    selenium.webdriver.support.select                    import                    Select

Nosotros next need to go through the set-up steps, including creating a Chrome WebDriver instance and then using it to load the example website of a search form. Details of the gear up-up procedure is covered in department 4.1.

                                  driver                    =                    webdriver.Chrome('YOUR_PATH_TO_chromedriver.exe_FILE')                  form_url                    =                    "https://iqssdss2020.pythonanywhere.com/tutorial/form/search"                                    driver.get(form_url)

We then input "A" into the text box whose ID is "search_name":

    driver.find_element_by_id('search_name').send_keys("A")

Next, we select the 5th grade from the dropdown menu whose ID is "search_grade":

    Select(driver.find_element_by_id("search_grade")).select_by_visible_text("5")

We then set the page size at 5.

    driver.find_element_by_id('p5').click()

We then check the two policy checkboxes:

    driver.find_element_by_id("privacypolicy").click()
    driver.find_element_by_id("termsconditions").click()

Finally, we submit the information we have filled in the search form to the server:

    driver.find_element_by_id("search").click()

Fill in the form many times

Now suppose that we want to search for all the students. This will require us to fill in the form many times, each time with changing input. We first have to play with the website to see whether the form webpage is refreshed every time we access it. This will determine how we write our code. If the form page is refreshed every time we access it, then we have to refill all the form fields each time, even if most of those fields take the same input. The program below shows an example of this use case.

We first wrap the code that fills in the form once into a separate function called fill_in_form_once() with three parameters: the driver object, the first letter of a student's name, and the grade. We need the latter two parameters because each time we fill in the form, the input in these two fields changes while the input in all other fields stays the same.

    def fill_in_form_once(driver, letter, grade):
        driver.find_element_by_xpath('//*[@id="search_name"]').send_keys(letter)
        driver.find_element_by_xpath('//*[@id="search_grade"]/option[{}]'.format(grade)).click()
        driver.find_element_by_id('p5').click()
        driver.find_element_by_id("privacypolicy").click()
        driver.find_element_by_id("termsconditions").click()
        driver.find_element_by_xpath('//*[@id="search"]').click()
        time.sleep(5)
        return driver

Once we submit our form, we will usually get a result table on a page, with each row representing a student record that satisfies the search criteria. So, we write another separate function called scrape_table_this_page() that scrapes all student records in the table on this page. It takes the driver object as its parameter and returns all student records in the result table on this page, stored in a list of dictionaries. Section 4.3.1 covers details of how to scrape a table.

    def scrape_table_this_page(driver):
        students_this_page = list()
        table = driver.find_element_by_xpath('//*[@id="results"]/table')
        entries = table.find_elements_by_tag_name("tr")
        for i in range(1, len(entries)):
            student_dict = dict()
            cols = entries[i].find_elements_by_tag_name("td")
            student_dict["name"] = cols[0].text
            student_dict["grade"] = cols[1].text
            student_dict["gpa"] = cols[2].text
            students_this_page.append(student_dict)
        return students_this_page

Now, we put all these things together and write a complete program. We start by importing the packages needed for the job and going through the set-up procedure as usual:

    from selenium import webdriver
    import time
    import string
    import pandas as pd

    driver = webdriver.Chrome('YOUR_PATH_TO_chromedriver.exe_FILE')
    searchAddress = "https://iqssdss2020.pythonanywhere.com/tutorial/form/search"
    driver.get(searchAddress)
    time.sleep(2)

The code then moves to the main part. We first create a list to store the final search result. We use a double loop in which we loop over each letter and, for a given letter, loop over each grade. In the innermost loop, we call the function fill_in_form_once() with a given grade for a given letter, which fills in all the fields of the form and submits it to the server.

There is no guarantee that every search will return student records that satisfy the search criteria. So, we use try and except (the outer pair) to handle the situation when no result is returned. In either case, when a single search is over, we refresh the search form (driver.get(searchAddress)) and go to the next search.

Then let us focus on what happens inside the while loop. We first scrape all student records in the result table on one page by calling the scrape_table_this_page() function, and append the results to the final list of results. Because we set the page size at 5, at most 5 rows will show up on a page. So, if more than 5 student records satisfy the search criteria, the search result will span multiple pages. We do not know in advance how many pages there are. This is why we use a while loop instead of a for loop.

We only know that we can move to the next page unless the result is contained in a single page or the current page is the last page. We use the try and except pair within the while loop to handle the situation when there is no next page. In that case, we break out of the while loop because we have reached the end of the results for this search: we have either scraped all results contained in a single page or moved page by page to the last page and scraped all results.

    students = list()

    for letter in string.ascii_uppercase:
        for grade in range(2,8):
            driver = fill_in_form_once(driver, letter, grade)
            try:
                while True:
                    students_this_page = scrape_table_this_page(driver)
                    students.extend(students_this_page)
                    try:
                        driver.find_element_by_xpath('//*[@id="next"]').click()
                        time.sleep(2)
                    except:
                        break

                driver.get(searchAddress)
                time.sleep(2)
            except:
                print("No results for letter {} at grade {}".format(letter, grade - 2))
                driver.get(searchAddress)
                time.sleep(2)

Finally, we store the list of final results in a Pandas data frame and close our driver.

    students_df = pd.DataFrame.from_records(students)
    print(students_df)
    driver.close()
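The pagination pattern inside the while loop (scrape a page, try to move on, stop when that fails) can be isolated from Selenium. The following is a minimal sketch under that assumption: scrape_all_pages and FakeResults are hypothetical names, and the fake class merely simulates a result table split over several pages so the sketch runs without a browser.

```python
def scrape_all_pages(scrape_page, go_to_next_page):
    """Collect rows page by page until moving to the next page fails."""
    rows = []
    while True:
        rows.extend(scrape_page())
        try:
            go_to_next_page()
        except Exception:
            # Moving on failed, so this was the last (or only) page.
            break
    return rows


class FakeResults:
    """Hypothetical stand-in for a paginated result table."""

    def __init__(self, pages):
        self.pages = pages
        self.current = 0

    def scrape_page(self):
        return list(self.pages[self.current])

    def go_to_next_page(self):
        if self.current + 1 >= len(self.pages):
            raise LookupError("no next page")
        self.current += 1


results = FakeResults([["Adams", "Allen"], ["Avery"]])
print(scrape_all_pages(results.scrape_page, results.go_to_next_page))
```

In the Selenium program, scrape_page would correspond to scrape_table_this_page(driver) and go_to_next_page to clicking the "next" element. Catching a specific exception type rather than everything would make real failures easier to notice.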

In the complete program above, the driver.get(searchAddress) lines refresh the form whenever we finish the current search. This happens either when we find that there is no result for the search or when we have scraped its results. Since the form is refreshed for every new search, we need to refill the form every time a new search starts. This is why we call the fill_in_form_once(driver, letter, grade) function, which executes the form-filling actions, from the innermost layer of the for loop.

To know what to input for a new search, we have to know how far the search has progressed at the point when the form is refreshed. We control the progress of the rounds of search by indexing a list. We put all the options for a form field into a list, with the list elements in exactly the same order as they are displayed in the form field. When we loop through this list, we control where the search runs. In the above example, the lines for letter in string.ascii_uppercase and for grade in range(2,8) play this part. They control the indexing of the name list and the grade list. The name field and the grade field are the only two fields whose input values change for a new search. This is why we need to create a list for each of them and loop over it, rather than doing the same for all the other form fields.
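To see which searches the two loops generate and in what order, the sketch below enumerates them without touching a browser. It assumes the dropdown markup shown earlier in this chapter, where the first option is "(no value)"; XPath positions are 1-based, which is why the grade loop starts at 2. GRADE_OPTIONS and grade_label are hypothetical names introduced for illustration.

```python
import string
from itertools import product

# Option texts in the order of the example dropdown markup in this chapter.
GRADE_OPTIONS = ["(no value)", "K", "1", "2", "3", "4", "5"]


def grade_label(xpath_index):
    """Map a 1-based XPath option position to the text shown in the dropdown."""
    return GRADE_OPTIONS[xpath_index - 1]


# Same order as the nested loops: every grade for "A", then every grade for "B", ...
searches = [(letter, grade_label(index))
            for letter, index in product(string.ascii_uppercase, range(2, 8))]

print(len(searches))
print(searches[:3])
```

Note that the "No results" message in the program prints grade - 2, so under this assumption kindergarten ("K") shows up there as grade 0.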

Another scenario is that the form page is not refreshed for a new search. In this case, we do not need to refill the form fields whose values do not change in a new search. Below is the program that performs the same task, searching all students, in the use case where the form is not refreshed. Some chunks of code repeat the last program, so I only show the part of the code that is different:

    students = list()

    driver.find_element_by_id('p5').click()
    driver.find_element_by_id("privacypolicy").click()
    driver.find_element_by_id("termsconditions").click()

    for letter in string.ascii_uppercase:
        driver.find_element_by_xpath('//*[@id="search_name"]').clear()
        driver.find_element_by_xpath('//*[@id="search_name"]').send_keys(letter)
        for grade in range(2,8):
            driver.find_element_by_xpath('//*[@id="search_grade"]/option[{}]'.format(grade)).click()
            driver.find_element_by_xpath('//*[@id="search"]').click()
            time.sleep(5)
            try:
                while True:
                    students_this_page = scrape_table_this_page(driver)
                    students.extend(students_this_page)
                    try:
                        driver.find_element_by_xpath('//*[@id="next"]').click()
                        time.sleep(2)
                    except:
                        break

            except:
                print("No results for letter {} at grade {}".format(letter, grade - 2))

There is no code in this program that refreshes the form page. We put the lines of code that fill in the constant form fields outside of the loop so that those fields are not refilled for every new search. The change in the positions of the relevant lines of code, compared with the program in the first scenario, reflects this idea.

For the form fields with changing input, namely student name and grade, I pull the code that inputs a student name out of the innermost loop and put it in the outer loop, because for a given first letter of a student name only the grade field changes in a search and we do not need to refill the name field. When the name field needs to be updated, we first have to clear the input box, since in this case the search form is not refreshed.
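The reason for clearing first is that typing appends to whatever the box already holds. The sketch below illustrates this with a stand-in object; replace_text and FakeInput are hypothetical names, and the fake class only imitates the clear(), send_keys(), and get_attribute("value") behavior of a real text box so the sketch runs without a browser.

```python
def replace_text(input_box, text):
    """Clear the box, type `text`, and return the value the box now holds."""
    input_box.clear()
    input_box.send_keys(text)
    return input_box.get_attribute("value")


class FakeInput:
    """Hypothetical stand-in for a text box web element."""

    def __init__(self):
        self._value = ""

    def clear(self):
        self._value = ""

    def send_keys(self, text):
        # Typing appends to the existing content, as in a real text box.
        self._value += text

    def get_attribute(self, name):
        return self._value if name == "value" else None


box = FakeInput()
box.send_keys("A")
box.send_keys("B")                 # without clearing, the box now holds "AB"
print(box.get_attribute("value"))
print(replace_text(box, "B"))      # clearing first leaves just "B"
```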

`ElementNotInteractableException`

In some cases an element is not interactable: the actions introduced in the sections above do not work on it, and we are likely to encounter an ElementNotInteractableException. This happens when an element is found but we cannot interact with it; for example, we may not be able to click it or send keys to it. There could be several reasons for this:

The element is not visible / not displayed.
The element is off screen.
The element is behind another element or hidden.
Some other action needs to be performed by the user first to enable the element.
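Some of these causes can be checked before acting on an element. The following is a rough sketch that relies on the is_displayed() and is_enabled() methods of a web element; can_interact and FakeElement are hypothetical names, and the fake class is a stand-in so the sketch runs without a browser. The check is only a heuristic: it does not detect an element that is covered by another element.

```python
def can_interact(element):
    """Rough pre-check: the element is rendered and not disabled."""
    return element.is_displayed() and element.is_enabled()


class FakeElement:
    """Hypothetical stand-in for a web element."""

    def __init__(self, displayed, enabled):
        self._displayed = displayed
        self._enabled = enabled

    def is_displayed(self):
        return self._displayed

    def is_enabled(self):
        return self._enabled


hidden_button = FakeElement(displayed=False, enabled=True)
visible_button = FakeElement(displayed=True, enabled=True)
print(can_interact(hidden_button))
print(can_interact(visible_button))
```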

There are strategies that may work to make the element interactable, depending on the circumstance.

Wait until clickable

If the element has not been fully loaded yet, we can wait until the element is visible / clickable. Look at the following example, in which we want to get the profile of the 5th grade student named "Adams". We first fill in the search form with letter "A" and grade 5 and submit it to the server, as before. This part of the code is already provided in the section "Fill in the form once". After we submit the search form to the server, we will usually get a result table returned. We then need to locate the link to the profile of student "Adams" in the result table. Below is how we continue the code after submitting the search form. This is the set-up chunk of code:

    driver.find_element_by_id("search").click()
    table = driver.find_element_by_xpath('//*[@id="results"]/table')
    entries = table.find_elements_by_tag_name("tr")
    fields = entries[1].find_elements_by_tag_name('td')
    fields[3].find_element_by_tag_name("a").click()

The above code produces an error message, "no such element: Unable to locate element", because the result table has not been fully loaded yet. Selenium WebDriver provides two types of waits to handle this: explicit and implicit waits. Calling time.sleep() is the extreme case of an explicit wait, in which the condition is an exact period of time to wait, as the code below shows:

    import time

    driver.find_element_by_id("search").click()
    time.sleep(3)
    table = driver.find_element_by_xpath('//*[@id="results"]/table')
    # same as the set-up chunk of code
    ...

As discussed in the previous chapter, a more efficient solution is to make WebDriver wait only as long as required. This is also an explicit wait, but more efficient than time.sleep(). The code below uses the presence of the resulting table element with ID "resulttable" to declare that the page has been fully loaded:

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver.find_element_by_id("search").click()
    table = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "resulttable")))
    # same as the set-up chunk of code
    ...
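To make the idea of waiting only as long as required concrete, here is a minimal pure-Python sketch of a polling loop. It illustrates the general technique and is not Selenium's implementation; wait_until, table_is_present, and the attempts counter are hypothetical names, and the condition simply succeeds on its third check to simulate a table that appears after a short delay.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within {} seconds".format(timeout))
        time.sleep(poll_interval)


# Hypothetical condition standing in for "the result table is in the DOM".
attempts = {"count": 0}


def table_is_present():
    attempts["count"] += 1
    return "table" if attempts["count"] >= 3 else None


print(wait_until(table_is_present, timeout=2.0, poll_interval=0.01))
print(attempts["count"])
```

Unlike a fixed time.sleep(3), the loop returns as soon as the condition holds and fails with an error only after the full timeout has passed.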

The final solution is to use an implicit wait, which tells WebDriver to poll the DOM for a certain amount of time when trying to find any element(s) not immediately available. The default setting is 0. Once set, the implicit wait applies for the life of the WebDriver object.

    driver.implicitly_wait(10)

    driver.find_element_by_id("search").click()
    # same as the set-up chunk of code
    ...

Execute JavaScript

On this dynamic search form with a hidden field webpage, we can see that the form submission button is hidden. If we still use the button's click() method, we will get an error message saying "element not interactable". In this case, we can opt to execute JavaScript that interacts directly with the DOM:

    # same as the set-up chunk of code
    ...
    driver.implicitly_wait(10)
    form_url = "https://iqssdss2020.pythonanywhere.com/tutorial/formhidden/search"

    # same as the set-up chunk of code
    ...
    driver.execute_script('document.getElementById("search").click();')
    # same as the set-up chunk of code
    ...

Perform preliminary action(s)

Let us again search for all the 5th grade students whose names start with "A" on this dynamic complete search form webpage. If we move our cursor over a student name, we will see a hover box showing up above the name. Suppose that we want to scrape the information in the hover box. The hover box is not actionable unless we move the cursor to a student name to enable it. Once we do that, if we inspect the webpage, we will see that the hover box element has been added to its DOM structure. Then, we can scrape the information in the hover box from there. In the following code segment we scrape the content of the hover box of student "Adams":

    # same as the set-up chunk of code
    ...
    from selenium.webdriver.common.action_chains import ActionChains

    # same as the set-up chunk of code
    ...
    driver.implicitly_wait(10)
    # same as the set-up chunk of code
    ...
    name_tag = fields[0].find_element_by_tag_name("span")
    hov = ActionChains(driver).move_to_element(name_tag)
    hov.perform()
    hov_id = name_tag.get_attribute("aria-describedby")
    print(hov_id)
    hov_text = driver.find_element_by_id(hov_id).text
    print(hov_text)

We first need to import the ActionChains class from Selenium WebDriver. We create an ActionChains object by passing it the driver object. We then find the element for the student name "Adams" in the page and move the cursor onto it using the move_to_element() method. We then use the perform() method to execute the actions that we have built on the ActionChains object. In this example, this action makes the hover box appear above the student name. Once this is done, the hover box element is added to the DOM structure of the page. By inspecting this new addition to the DOM structure, we can find the ID of the hover box through the "aria-describedby" attribute and therefore scrape the content of the hover box element associated with that ID.


Source: https://iqss.github.io/dss-webscrape/filling-in-web-forms.html