Tuesday, 4 August 2015

Extract part of a URL using pattern matching in Python

I have a list of links and want to use pattern matching in Python to pick out the ones whose URL matches a particular form.

Examples:

http://ift.tt/1KNZ7nE  
http://ift.tt/1g4Fr1i   

This is my regex:

re.match(r'(http?|ftp)(://[a-zA-Z0-9+&/@#%?=~_|!:,.;]*)(.\b[a-z]{1,3}\b)(/about[a-zA-Z-_]*/?)', str(href), re.IGNORECASE)  

I want to match only links that end with /about or /about/,
but the "about" part of my regex also matches every link that merely contains the word "about".
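
One likely cause is that re.match only anchors the start of the string, and the [a-zA-Z-_]* after /about lets the path keep going, so any URL that contains /about followed by more text still matches. A minimal sketch of one possible fix is to anchor the pattern at the end with $ and allow nothing after /about except an optional slash. The example.com URLs and the names ABOUT_RE and ends_with_about are made up for illustration, and the scheme and host parts are simplified compared with the regex above:

```python
import re

# Hypothetical pattern: scheme, then any non-space characters, then a final
# "/about" segment with an optional trailing slash. The "$" anchor is what
# rejects URLs that continue after "/about".
ABOUT_RE = re.compile(r'^(?:https?|ftp)://\S+/about/?$', re.IGNORECASE)


def ends_with_about(href):
    """Return True if the link ends with /about or /about/."""
    return ABOUT_RE.match(str(href)) is not None


# Made-up sample links, not taken from the original list.
links = [
    "http://example.com/about",
    "http://example.com/about/",
    "http://example.com/about-us",
    "http://example.com/about/team",
    "http://example.com/blog/all-about-python",
]

about_links = [link for link in links if ends_with_about(link)]
print(about_links)
# ['http://example.com/about', 'http://example.com/about/']
```

This sketch treats a query string or fragment after /about as a non-match. If those should be accepted, parsing the URL with urllib.parse.urlsplit and testing only the path component may be a cleaner approach.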



via Chebli Mohamed
